New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
js: handle 408 fetch requests pending status #967
Conversation
@jnmoyne use case was batch == 2 IIRC? |
c52f9c2
to
556d8be
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably need to rationalize what those 408 various responses mean from the server perspective and how to react to those. I am not clear on that.
js.go
Outdated
@@ -2558,7 +2573,8 @@ func (sub *Subscription) Fetch(batch int, opts ...PullOpt) ([]*Msg, error) { | |||
// are no messages. | |||
msg, err = sub.nextMsgWithContext(ctx, true, false) | |||
if err != nil { | |||
if err == errNoMessages { | |||
// We skip both 404 and any type of 408 errors at this stage. | |||
if err == errNoMessages || err == errRequestsPending { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will have no effect. We just called sub.nextMsgWithContext(), which is a low-level NATS call, but since we indicate that this is an internal pull call, it will return errNoMessages if there was no message and not go into a wait. You can check that function for details, but there is no way this function returns errRequestsPending since I don't see that you added that outside of checkMsg().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the review, removed the check since it had no effect
js.go
Outdated
err = ErrTimeout | ||
// In case of a fetch request with no wait request and expires time, | ||
// it will be a 408 error but with a Requests Pending description. | ||
desc := msg.Header.Get(descrHdr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From looking at server code, there are several possible descriptions for 408 (so far..): "Interest Expired", "Requests Pending", "Request Canceled" and "Request Timeout".
We need to think if we need to behave any differently for those error status or not. I am not sure. Depending on that answer, we can decide if we need to distinguish description or not.. (again, there are more than that one, and some may be added in the future).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure about the rest of the status at the moment but I think one way to simplify this could be to refactor the code to skip 408s unless making the initial expires + no wait batch request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of those mean the request is no longer valid and no additional messages will be sent for that request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to just check that it was a 408 from a no wait request instead of checking that it is a Requests Pending
, this will have the same effect and avoid the timeouts that @jnmoyne was running into.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Skip 408 errors thrown to client no wait + expires requests Signed-off-by: Waldemar Quevedo <wally@nats.io>
For fetch batches of n > 1, sometimes the client would receive a 408 Requests Pending status messages for Fetch requests (initial request with no wait + expires time) which resulted in the client skipping these errors until reaching the deadline.
In this PR we add special handling for this status so that the client makes a request without no_wait after getting a 408 status with requests pending.
Signed-off-by: Waldemar Quevedo wally@nats.io