Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix retries for requests with a body #231

Merged
merged 3 commits into from Feb 1, 2020

Conversation

sholladay
Copy link
Collaborator

Fixes #217

I now clone the request before using it so that the body is not consumed and thus fetch() does not freak out when we try to reuse the request during retries, since it can still read the body.

I tried to write a test for this, however it seems to be hard to test in Node, as node-fetch doesn't seem to throw an error under the same circumstances that actual fetch() does in the browser. It should be possible to test this with Puppeteer, though I haven't tried.

@DanielRuf
Copy link

Hm, MDN says:

https://developer.mozilla.org/en-US/docs/Web/API/Request/clone

The clone() method of the Request interface creates a copy of the current Request object.

clone() throws a TypeError if the response Body has already been used. In fact, the main reason clone() exists is to allow multiple uses of Body objects (when they are one-use only.)

If intend to modify the request, you may prefer the Request constructor.

@DanielRuf
Copy link

DanielRuf commented Jan 30, 2020

I think it makes sense to add a unit test which fails without the tests and should succeed with the changes.

Probably related:
JakeChampion/fetch#308

Also see https://stackoverflow.com/questions/34640286/how-do-i-copy-a-request-object-with-a-different-url and https://fetch.spec.whatwg.org/#body

Currently I can not tests it as we already migrated many parts of the code and I do not have the time to test the changes in depth.

@sholladay
Copy link
Collaborator Author

I think it makes sense to add a unit test which fails without the tests and should succeed with the changes.

Well, yes, except that node-fetch doesn't fail when reusing the request. Hard to test for a failure that never happens. Like I said in the PR description, we'll probably need Puppeteer to properly test this, so that a real Request and fetch implementation is used. Otherwise the test will always succeed, even with versions of Ky that have the bug.

@sindresorhus
Copy link
Owner

You could write the test in https://github.com/sindresorhus/ky/blob/master/test/browser.js ?

return request.catch(error_ => error_.toString());
}, server.url);
t.is(error, 'TimeoutError: Request timed out');

// A note from @szmarczak: await server.close() hangs on my machine
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was hanging for me, too. I think because the server's response stream was never being closed. That behavior doesn't seem necessary for this test, so I simply made the Ky timeout less than the server timeout and it works as expected.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! ☺️

return request.catch(error_ => error_.toString());
}, server.url);
t.is(error, 'HTTPError: Bad Gateway');
t.is(requestCount, 2);
Copy link
Collaborator Author

@sholladay sholladay Feb 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For reasons that I do not understand, the requestCount assertion was failing and I was only able to solve it by changing the status code from 408 to something else. When the status is 408, the server receives roughly double the number of requests I am expecting (i.e. roughly 2x whatever retry is). When I set the retry option to 1, then requestCount was 2 and the assertion passed. When I set retry to 8, then requestCount was 14 and the assertion failed (note: not quite double).

So, of course it seems like a bug in Ky. And sure enough, when I debugged it with await puppeteer.launch({ devtools: true }); in test/helpers/with-page.js, I was able to see these requests in the Network tab and the timings looked like they were respecting our backoff algorithm, so these requests are likely being triggered within Ky.

The thing is, we have no special handling of 408, nor 502 in Ky (other than being retriable), so I don't see how this could be fixed by changing from 408 to 502. I also wasn't able to easily reproduce this behavior manually in Chrome using endpoints from https://httpstat.us/. To me, it's a mystery.

I chose to just change the status code in commit 2808b0f.

@sholladay
Copy link
Collaborator Author

sholladay commented Feb 1, 2020

@sindresorhus I added a test for this. Took a lot of time due to debugging the weird behavior I mentioned in #231 (comment).

@sholladay sholladay merged commit 55b2535 into sindresorhus:master Feb 1, 2020
@sholladay sholladay deleted the clone-request branch March 2, 2020 18:11
@tristanreid
Copy link

Hi there - this is in reference to a Puppeteer issue rather than a KY issue, but I hope it will help anyone struggling with this particular bug:

If you clone a POST request object and use it in a fetch, it will succeed, but if you intercept that request using puppeteer, the postData will be empty. This is causing me an issue because the site that I'm trying to screenshot with puppeteer uses KY, and ever since the fix to this retries issue that introduces request.clone(), the site's POST requests all fail. This is only happening because of the combination of the puppeteer bug along with the fact that I need to proxy requests and therefore use their postData.

I believe the proper fix should be puppeteer changing, not KY, and I don't expect this edge case to impact many people, but it was difficult for me to track down, so I want to share here as well as the bug that I filed over there.

@sholladay
Copy link
Collaborator Author

sholladay commented Jul 10, 2020

That Puppeteer behavior is just... awful. The retries are implicit and seemingly undocumented. The retries are only for certain status codes without a logical pattern (possibly only 408?). They aren't reported or exposed programmatically. They happen in Puppeteer but not Chrome. This has to be someone's idea of making tests more "reliable", giving everyone a false sense of security.

I've had plenty of weird bugs that were hard to debug that I look back on fondly. This is not one of them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Issue with Retries after v0.16.0
5 participants