Introduce Lingering Timeout #10569

Merged
merged 2 commits into from Apr 8, 2024

@rtribotte (Member) commented Apr 4, 2024

What does this PR do?

This PR introduces the respondingTimeouts.lingeringTimeout option for entry points, with a default value of 2s.

The lingering timeout defines the maximum duration between each TCP read operation.
As a layer 4 timeout, it applies during HTTP handling but respects the respondingTimeouts.readTimeout option configuration.

The default value is deliberately low and may close connections too early.
This could be a breaking change for "server-first" protocols.
We suggest adjusting this value to suit your situation.

This PR also deprecates the respondingTimeouts.<timeout> options:

  • <entryPoint>.transport.respondingTimeouts.readTimeout
  • <entryPoint>.transport.respondingTimeouts.writeTimeout
  • <entryPoint>.transport.respondingTimeouts.idleTimeout

They have been replaced by:

  • <entryPoint>.transport.respondingTimeouts.http.readTimeout
  • <entryPoint>.transport.respondingTimeouts.http.writeTimeout
  • <entryPoint>.transport.respondingTimeouts.http.idleTimeout
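
The rename above can be sketched as a small key-rewriting helper. This is only an illustration: `migrateKey` is a hypothetical function, not part of Traefik, and `web` is an assumed entry point name.

```go
package main

import (
	"fmt"
	"strings"
)

// migrateKey is a hypothetical helper that rewrites a deprecated
// respondingTimeouts option key to its http-scoped replacement.
// Keys that are not one of the three deprecated options are returned as-is.
func migrateKey(key string) string {
	const section = ".transport.respondingTimeouts."
	i := strings.LastIndex(key, section)
	if i < 0 {
		return key
	}
	name := key[i+len(section):]
	switch name {
	case "readTimeout", "writeTimeout", "idleTimeout":
		return key[:i] + section + "http." + name
	}
	return key
}

func main() {
	keys := []string{
		"web.transport.respondingTimeouts.readTimeout",
		"web.transport.respondingTimeouts.writeTimeout",
		"web.transport.respondingTimeouts.idleTimeout",
		"web.transport.respondingTimeouts.lingeringTimeout", // left unchanged
	}
	for _, k := range keys {
		fmt.Println(k, "->", migrateKey(k))
	}
}
```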

Motivation

This change prevents Traefik instances running the default configuration from hanging while waiting for bytes to be read on a connection.
This has been identified to be an issue with:

Fixes #10448.
Supersedes #10531.

More

  • Added/updated tests
  • Added/updated documentation

Additional Notes

Co-authored-by: Baptiste Mayelle <baptiste.mayelle@traefik.io>
Co-authored-by: Kevin Pollet <pollet.kevin@gmail.com>

@mmatur (Member) left a comment

LGTM

@lbenguigui (Contributor) left a comment

LGTM

@juliens (Member) left a comment

LGTM

@traefiker merged commit cef8422 into traefik:v2.11 on Apr 8, 2024
22 checks passed
@ngbrown commented Apr 11, 2024

Can someone expand more on what is meant by "between each TCP read operation"? Is Traefik monitoring the TCP packet acks from the service? Or is it the delay between received packets from the client on the ingress port? Can more information be provided on how this is technically measured?

This change also seems to break the AMQP TCP protocol and AMQP over WebSockets. Setting the value to 0 cures the problems, but if I knew what the above phrase meant, I could try other values.

Edit: The documentation verbiage seems to be related to the Go net package documentation for SetReadDeadline(). It's really not descriptive enough as to what is going on for a non-Go programmer.

A better description would describe what is happening at a more physical level. Like: "this timeout is the maximum delay between received packets". Is this an accurate description? Does it affect both directions?

@yashgorana

Can someone please help me understand why this new timeout (1) is introduced as a patch, (2) has such a small value of 2s, knowing that it will break systems, and, more importantly, (3) why it isn't an opt-in feature? This PR triggered a bunch of issues, reported in #10596, #10595, and #10589.
