Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Time out initial fetch, and go to account-picker screen #4166

Closed
wants to merge 8 commits into from

Conversation

chrisbobbe
Copy link
Contributor

@chrisbobbe chrisbobbe commented Jun 18, 2020

Fixes: #4165

N.B.: Extracting promiseTimeout into a helper function, as @gnprice suggested here, led me to add some time-elapsed arithmetic near the call site so we're not too generous and give each attempt a full 60 seconds. We shrink timeLimitMs with each successive call.

`tryUntilSuccessful` is about to grow a timeout, which we've had in
mind in particular for the initial fetch. It's also about to be
moved into this file, and we want to keep its association with
`doInitialFetch` by putting it directly above it, with nothing in
between.

Removing this call site lets us do so without ESLint complaining of
`no-use-before-define`. And Greg says it's all the same to him [1],
especially since `fetchPrivateMessages` is doomed to be deleted
anyway, as the JSDoc says.

[1]: zulip#4165 (comment)
@chrisbobbe chrisbobbe requested a review from gnprice June 18, 2020 03:07
@chrisbobbe
Copy link
Contributor Author

chrisbobbe commented Jun 18, 2020

Oh, goodness—I do need to add tests for promiseTimeout, I think. And probably tryFetch; I removed the tests for that under its previous name, tryUntilSuccessful, partly because they weren't really doing much. Glad to see that there are already tests for isClientError.

Does that sound right?

@gnprice
Copy link
Member

gnprice commented Jun 18, 2020

Yeah, let's have tests for promiseTimeout. 🙂 Then for tryFetch, two of the three old tests were pretty trivial, but the third tests that it does retry. It'd also be good to test its new timeout functionality, and the existing isClientError feature.

Copy link
Member

@gnprice gnprice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @chrisbobbe ! This looks like a good solution. Some comments below.

Comment on lines +10 to +15
* Returns a new Promise with the same outcome as `promise`, if
* `promise` completes in time.
*
* If `promise` does not complete before `timeLimitMs` has passed,
* `onTimeout` is called, and its outcome is used as the outcome of
* the returned Promise.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this term "outcome". It's a good way of conveying this should apply to completion and failure uniformly.

src/utils/async.js Outdated Show resolved Hide resolved
Comment on lines +38 to +35
if (e instanceof TimeoutError && onTimeout !== undefined) {
return onTimeout();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like ideally it belongs at the same place as the new TimeoutError(), and replacing that in the onTimeout case. I think that would simplify the code, because this whole try/catch would then get to go away.

IOW, the whole try/catch could become just

  return await Promise.race([promise, timeoutPromise]);

with a suitable version of timeoutPromise that itself handles the onTimeout.

(And then in fact the await could be deleted, along with the async: just return Promise.race(…);.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I thought of doing it that way—but then, IIUC, it's not a race between (A) the outcome of promise and (B) the timer (which is what it says on the tin), but rather, between (A) the outcome of promise and (B) the timer plus the outcome of a Promise returned by onTimeout.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mm, bumping this, @gnprice, as I don't think we got around to it on the call (we were mostly talking about #4166 (comment)).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concretely, I think you're suggesting something like this:

  const timeoutPromise = new Promise((resolve, reject) =>
    setTimeout(
      () => (onTimeout !== undefined ? resolve(onTimeout()) : reject(new TimeoutError())),
      timeLimitMs,
    ),
  );
  return Promise.race([promise, timeoutPromise]);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this gives promise too much time to complete; it now has timeLimitMs plus however long a Promise returned by onTimeout takes. Worse, if this gives enough time for promise to complete, then we're in a weird in-between state where onTimeout has fired (i.e., the time is properly up) but the outcome is the outcome of promise.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolve(onTimeout()) might look a bit weird if onTimeout returns a Promise; I'm not used to doing that. But I think this excerpt supports the idea that the outcome of onTimeout() (the Promise returned by onTimeout—which may take arbitrarily long to arrive) is assigned to be the outcome of timeoutPromise.

It's from a big green note here, in a section on the Promise constructor in an ECMAScript spec (formatting lost):

The resolve function that is passed to an executor function accepts a single argument. The executor code may eventually call the resolve function to indicate that it wishes to resolve the associated Promise object. The argument passed to the resolve function represents the eventual value of the deferred action and can be either the actual fulfillment value or another Promise object which will provide the value if it is fulfilled.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll include a test for this case and you can play around with it if you have doubts. 🙂

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, good point. And good thought to add a test for it. 🙂

It still feels like this logic is sort of awkwardly split between the new TimeoutError() and the later catch and if (e instanceof TimeoutError && onTimeout !== undefined). But evidently it'd take some deeper thinking about these promise and timer semantics to fix that than I'd thought. And there isn't that much to gain -- the behavior seems correct as it is, and the code is already short and not hard to understand.

if (isClientError(e) || e instanceof TimeoutError) {
throw e;
}
await backoffMachine.wait();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally this would also be covered by the logic expressed by timeLimitMs above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, true—see one possible way this might look, at #4166 (comment).

Comment on lines 241 to 257
const timeAtStart = Date.now();
// eslint-disable-next-line no-constant-condition
while (true) {
const msElapsed = Date.now() - timeAtStart;
// If the first attempt had a 60-second timeout, and it took 3
// seconds, then the second attempt should have a 57-second
// timeout. Etc.
const timeLimitMs = MAX_TIME_MS - msElapsed;
try {
return await promiseTimeout<T>(func(), timeLimitMs);
} catch (e) {
if (isClientError(e) || e instanceof TimeoutError) {
throw e;
}
await backoffMachine.wait();
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this and particularly with the comment I just made about the backoffMachine.wait(), I wonder: with the spiffy new promiseTimeout helper, could all this timeout logic be simplified by using that around the whole thing? I think it could.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, if I understand correctly, something like this?

export async function tryFetch<T>(func: () => Promise<T>): Promise<T> {
  const MAX_TIME_MS: number = 60000;
  const backoffMachine = new BackoffMachine();

  return promiseTimeout<T>(
    (async () => {
      // eslint-disable-next-line no-constant-condition
      while (true) {
        try {
          return await func();
        } catch (e) {
          if (isClientError(e) || e instanceof TimeoutError) {
            throw e;
          }
          await backoffMachine.wait();
        }
      }
      // Without this, Flow 0.92.1 does not know this code is unreachable,
      // and it incorrectly thinks Promise<undefined> could be returned,
      // which is inconsistent with the stated Promise<T> return type.
      // eslint-disable-next-line no-unreachable
      throw new Error();
    })(),
    MAX_TIME_MS,
  );
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, hmm, that's not quite it—the infinite loop could continue infinitely, while being ignored, even after the timer was up. I'll see what I can think of.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Er, there's probably a more elegant solution, but this might be working code:

export async function tryFetch<T>(func: () => Promise<T>): Promise<T> {
  const MAX_TIME_MS: number = 60000;
  const backoffMachine = new BackoffMachine();

  let timerHasExpired = false;

  return promiseTimeout(
    (async () => {
      // eslint-disable-next-line no-constant-condition
      while (true) {
        if (timerHasExpired) {
          // No one is listening for this Promise's outcome, so stop
          // doing more work.
          throw new Error();
        }
        try {
          return await func();
        } catch (e) {
          if (isClientError(e)) {
            throw e;
          }
          await backoffMachine.wait();
        }
      }
      // Without this, Flow 0.92.1 does not know this code is unreachable,
      // and it incorrectly thinks Promise<undefined> could be returned,
      // which is inconsistent with the stated Promise<T> return type.
      // eslint-disable-next-line no-unreachable
      throw new Error();
    })(),
    MAX_TIME_MS,
    () => {
      timerHasExpired = true;
      throw new TimeoutError();
    }
  );
}

src/message/fetchActions.js Outdated Show resolved Hide resolved
src/message/fetchActions.js Outdated Show resolved Hide resolved
We'll use the `async` ones [1] in upcoming commits (the non-`async`
`runAllTimers` is added for completeness, since we add its `async`
counterpart).

I'm not really sure how, but these will fix issues where the
expected behavior in a non-Lolex world isn't observed. I suspect
this is relevant:

"""
The `runAllAsync()` will also break the event loop, allowing any
scheduled promise callbacks to execute before running the timers.
"""

and the similar comment on `runToLastAsync`.

[1]: https://github.com/sinonjs/fake-timers#clockrunall--await-clockrunallasync
As Greg says [1], it's always been closely tied to
`doInitialFetch`'s specific needs, with its handling of 4xx errors.
We're about to make it more so, by adding timeout logic.

Also, remove two pretty trivial tests, and adapt the third (that
tests that a retry is done) to use Lolex. Add another test to
confirm that a 4xx error does indeed halt the retry loop and get
thrown.

[1]: zulip#4165 (comment)
To be dispatched when we time out attempts to connect to a server
that isn't responding, or when a server responds with a 5xx error.

It navigates to the 'accounts' screen so a user can try a different
account and server. Logging out wouldn't be good; the credentials
may be perfectly fine, and we'd like to keep them around to try
again later.

We also need to ensure `needsInitialFetch` is set to `false` [1],
while avoiding the semantics (and a navigation to the 'main' screen,
which depends on data from the initial fetch!) of
`INITIAL_FETCH_COMPLETE`. See also discussion [2].

[1]: As also noted in 7caa4d0, `needsInitialFetch` is
     "edge-triggered":

     """
     [I]t's consulted inside a `componentDidUpdate` method, which
     means it typically will cause a `doInitialFetch` call only when
     it's *updated* to true, not if it's continuously true.
     """

     This logic seems complex and fragile, and it would be nice to
     fix that.

[2]: https://chat.zulip.org/#narrow/stream/243-mobile-team/topic/Stuck.20on.20loading.20screen/near/907591
chrisbobbe added a commit to chrisbobbe/zulip-mobile that referenced this pull request Jun 23, 2020
As Greg points out [1], this makes the most sense conceptually; it's
happening at the bottom of the loop, just before a new iteration
starts. The `return` in the `try` block is enough to ensure that
this wait won't interfere with a successful fetch.

[1]: zulip#4166 (comment)
We'll use this in `tryFetch`, in an upcoming commit, so that we can
interrupt in-progress attempts to contact a server when they take
too long. See discussion [1].

[1]: https://chat.zulip.org/#narrow/stream/243-mobile-team/topic/Stuck.20on.20loading.20screen/near/907693
Using `INITIAL_FETCH_ABORT` and `promiseTimeout`, which we set up in
the last few commits, abort and go to the accounts screen if the
initial fetch is taking too long.

Fixes: zulip#4165
As Greg points out [1], this makes the most sense conceptually; it's
happening at the bottom of the loop, just before a new iteration
starts. The `return` in the `try` block is enough to ensure that
this wait won't interfere with a successful fetch.

[1]: zulip#4166 (comment)
As hinted by the comment above this edit, even an empty array is
incomplete server data: "Valid server data must have a user: the
self user, at a minimum.".

But the initial state of `users` (`initialState` in
src/users/usersReducer.js) is an empty array. LOGOUT, LOGIN_SUCCESS,
and ACCOUNT_SWITCH all set `users` to that initial state.

Those three actions are handled in `navReducer` by navigating to
routes that don't depend on complete server data: 'loading' or
'account'.

But if we quit the app with `users` having this initial state (e.g.,
giving up in frustration while trying to connect to a realm for a
long time), we should be sure that, on reopening it and rehydrating,
we don't try to go to 'main' before that initial state has been
replaced with a complete state. Otherwise, we see "No server data
found" errors and a white screen.

See also discussion [1].

[1]: https://chat.zulip.org/#narrow/stream/243-mobile-team/topic/.23M4166.20Time.20out.20initial.20fetch/near/912017
@chrisbobbe
Copy link
Contributor Author

It'd also be good to test its new timeout functionality, and the existing isClientError feature.

It looks like there are already tests for isClientError; did you have more thoroughness in mind?

I just pushed my latest revision, so this is ready for another review.

Copy link
Member

@gnprice gnprice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the revisions! Comments below.

};

const tryFetchPromise = tryFetch(async () => {
await new Promise(r => setTimeout(r, 10));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clearer as sleep, I think: await sleep(10);.

(Ditto below.)


test('If `promise` resolves with `x` before time is up, resolves with `x`, `onTimeout` not called', async () => {
const x = Math.random();
const quickPromise = new Promise(resolve => setTimeout(() => resolve(x), 1));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Staring at this a bit to unpack it, I think it means the same thing as

const quickPromise = sleep(1).then(() => x);

Does that seem right? I think that would be easier to understand.

Comment on lines +36 to +38
const quickPromise = new Promise((_, reject) =>
setTimeout(() => reject(new Error(x.toString())), 1),
);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this would be sleep(1).then(() => { throw new Error(x.toString()); }).

const onTimeout = jest.fn();

const quickPromiseWithTimeout = promiseTimeout(quickPromise, ONE_MINUTE_MS, onTimeout);
lolex.runOnlyPendingTimers();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use runAllTimers or runAllTimersAsync for all of these?

In general that seems easiest to think about -- it means we're running the whole thing to completion, and when we inspect the results at the end of the test there's no possibility we're seeing an intermediate state where there's still something more the code would do later if we let it.

There are cases where that isn't possible -- basically where there's an infinite loop -- but it seems worth reserving the greater care of thinking about intermediate states for those cases.

Comment on lines +89 to +100
async runOnlyPendingTimersAsync(): Promise<void> {
this._clock.runToLastAsync();
}

runAllTimers(): void {
this._clock.runAll();
}

async runAllTimersAsync(): Promise<void> {
this._clock.runAllAsync();
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like in this version you don't end up using runOnlyPendingTimersAsync.

I suspect the key thing that was needed here wasn't the "async", but the "all". In the tests that use runAllTimersAsync, there's an onTimeout callback which... as its name predicts, gets called only after a timeout, and which then creates a new timeout. So runOnlyPendingTimers won't run that new timeout, because it wasn't already pending when runOnlyPendingTimers was invoked.

This is probably a sign that we should indeed make a habit of just using the "all" versions all the time 🙂, as I suggested in another comment just now, outside of the specific cases where there's a reason we can't.

Comment on lines +51 to +52
const endlessPromise = new Promise((resolve, reject) => {});

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be pulled out to the top of the describe -- that'll make most of these tests a couple of lines shorter.

They're all doing very similar things, so the more we can shorten the repeating parts and highlight the parts that actually differ, the easier it is to see what-all is being tested.

Comment on lines +96 to +99
const endlessPromiseWithTimeout = promiseTimeout(endlessPromise, ONE_MINUTE_MS, onTimeout);
await lolex.runAllTimersAsync();

const result = await endlessPromiseWithTimeout;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have this pattern a lot in our Lolex-using tests, of:

  • create a promise
  • run timers
  • await the promise.

The usual idiom is to combine "create promise" and "await promise" on the same line without giving the promise a name; but that isn't possible here because the "run timers" step needs to go in between.

How about pulling this pattern out as a function, perhaps named like withAllTimers? So e.g.

const result = await withAllTimers(promiseTimeout(endlessPromise, ONE_MINUTE_MS, onTimeout));

I think that'd help simplify a lot of these tests, and further contribute to isolating the parts that are interesting and different between them.

await endlessPromiseWithTimeout;

expect(onTimeout).toHaveBeenCalledTimes(1);
expect(onTimeout).toHaveBeenCalledWith();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part is redundant with Flow, I think -- the implementation of promiseTimeout wouldn't pass the type-checker if it tried to pass arguments to a callback of type () => T | Promise<T>.

chrisbobbe added a commit to chrisbobbe/zulip-mobile that referenced this pull request Jul 16, 2020
As Greg points out [1], this makes the most sense conceptually; it's
happening at the bottom of the loop, just before a new iteration
starts. The `return` in the `try` block is enough to ensure that
this wait won't interfere with a successful fetch.

[1]: zulip#4166 (comment)
chrisbobbe added a commit to chrisbobbe/zulip-mobile that referenced this pull request Jul 16, 2020
As Greg points out [1], this makes the most sense conceptually; it's
happening at the bottom of the loop, just before a new iteration
starts. The `return` in the `try` block is enough to ensure that
this wait won't interfere with a successful fetch.

[1]: zulip#4166 (comment)
@chrisbobbe
Copy link
Contributor Author

Closing, as this revision has grown old and doesn't have working tests—#4193 is working on that.

@chrisbobbe chrisbobbe closed this Aug 12, 2020
chrisbobbe added a commit to chrisbobbe/zulip-mobile that referenced this pull request Aug 25, 2020
As Greg points out [1], this makes the most sense conceptually; it's
happening at the bottom of the loop, just before a new iteration
starts. The `return` in the `try` block is enough to ensure that
this wait won't interfere with a successful fetch.

[1]: zulip#4166 (comment)
chrisbobbe added a commit to chrisbobbe/zulip-mobile that referenced this pull request Aug 25, 2020
As Greg points out [1], this makes the most sense conceptually; it's
happening at the bottom of the loop, just before a new iteration
starts. The `return` in the `try` block is enough to ensure that
this wait won't interfere with a successful fetch.

[1]: zulip#4166 (comment)
chrisbobbe added a commit to chrisbobbe/zulip-mobile that referenced this pull request Oct 27, 2020
As Greg points out [1], this makes the most sense conceptually; it's
happening at the bottom of the loop, just before a new iteration
starts. The `return` in the `try` block is enough to ensure that
this wait won't interfere with a successful fetch.

[1]: zulip#4166 (comment)
chrisbobbe added a commit to chrisbobbe/zulip-mobile that referenced this pull request Jan 11, 2021
As Greg points out [1], this makes the most sense conceptually; it's
happening at the bottom of the loop, just before a new iteration
starts. The `return` in the `try` block is enough to ensure that
this wait won't interfere with a successful fetch.

[1]: zulip#4166 (comment)
chrisbobbe added a commit to chrisbobbe/zulip-mobile that referenced this pull request Feb 8, 2021
As Greg points out [1], this makes the most sense conceptually; it's
happening at the bottom of the loop, just before a new iteration
starts. The `return` in the `try` block is enough to ensure that
this wait won't interfere with a successful fetch.

[1]: zulip#4166 (comment)
chrisbobbe added a commit to chrisbobbe/zulip-mobile that referenced this pull request Apr 13, 2021
As Greg points out [1], this makes the most sense conceptually; it's
happening at the bottom of the loop, just before a new iteration
starts. The `return` in the `try` block is enough to ensure that
this wait won't interfere with a successful fetch.

[1]: zulip#4166 (comment)
chrisbobbe added a commit to chrisbobbe/zulip-mobile that referenced this pull request Apr 13, 2021
As Greg points out [1], this makes the most sense conceptually; it's
happening at the bottom of the loop, just before a new iteration
starts. The `return` in the `try` block is enough to ensure that
this wait won't interfere with a successful fetch.

[1]: zulip#4166 (comment)
chrisbobbe added a commit to chrisbobbe/zulip-mobile that referenced this pull request May 19, 2021
As Greg points out [1], this makes the most sense conceptually; it's
happening at the bottom of the loop, just before a new iteration
starts. The `return` in the `try` block is enough to ensure that
this wait won't interfere with a successful fetch.

[1]: zulip#4166 (comment)
chrisbobbe added a commit to chrisbobbe/zulip-mobile that referenced this pull request May 20, 2021
As Greg points out [1], this makes the most sense conceptually; it's
happening at the bottom of the loop, just before a new iteration
starts. The `return` in the `try` block is enough to ensure that
this wait won't interfere with a successful fetch.

[1]: zulip#4166 (comment)
chrisbobbe added a commit to chrisbobbe/zulip-mobile that referenced this pull request May 21, 2021
As Greg points out [1], this makes the most sense conceptually; it's
happening at the bottom of the loop, just before a new iteration
starts. The `return` in the `try` block is enough to ensure that
this wait won't interfere with a successful fetch.

[1]: zulip#4166 (comment)
chrisbobbe added a commit to chrisbobbe/zulip-mobile that referenced this pull request Jun 2, 2021
As Greg points out [1], this makes the most sense conceptually; it's
happening at the bottom of the loop, just before a new iteration
starts. The `return` in the `try` block is enough to ensure that
this wait won't interfere with a successful fetch.

[1]: zulip#4166 (comment)
chrisbobbe added a commit to chrisbobbe/zulip-mobile that referenced this pull request Jun 25, 2021
As Greg points out [1], this makes the most sense conceptually; it's
happening at the bottom of the loop, just before a new iteration
starts. The `return` in the `try` block is enough to ensure that
this wait won't interfere with a successful fetch.

[1]: zulip#4166 (comment)
chrisbobbe added a commit to chrisbobbe/zulip-mobile that referenced this pull request Jul 15, 2021
As Greg points out [1], this makes the most sense conceptually; it's
happening at the bottom of the loop, just before a new iteration
starts. The `return` in the `try` block is enough to ensure that
this wait won't interfere with a successful fetch.

[1]: zulip#4166 (comment)
gnprice pushed a commit to chrisbobbe/zulip-mobile that referenced this pull request Jul 16, 2021
As Greg points out [1], this makes the most sense conceptually; it's
happening at the bottom of the loop, just before a new iteration
starts. The `return` in the `try` block is enough to ensure that
this wait won't interfere with a successful fetch.

[1]: zulip#4166 (comment)
@chrisbobbe chrisbobbe deleted the pr-initial-fetch-timeout branch November 5, 2021 00:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Time out initial fetch, and go to account-picker screen
2 participants