Time out initial fetch, and go to account-picker screen #4166

chrisbobbe · 2020-06-18T03:07:18Z

N.B.: Extracting promiseTimeout into a helper function, as @gnprice suggested here, led me to add some time-elapsed arithmetic near the call site so we're not too generous and give each attempt a full 60 seconds. We shrink timeLimitMs with each successive call.

`tryUntilSuccessful` is about to grow a timeout, which we've had in mind in particular for the initial fetch. It's also about to be moved into this file, and we want to keep its association with `doInitialFetch` by putting it directly above it, with nothing in between. Removing this call site lets us do so without ESLint complaining of `no-use-before-define`. And Greg says it's all the same to him [1], especially since `fetchPrivateMessages` is doomed to be deleted anyway, as the JSDoc says. [1]: zulip#4165 (comment)

chrisbobbe · 2020-06-18T03:13:08Z

Oh, goodness—I do need to add tests for promiseTimeout, I think. And probably tryFetch; I removed the tests for that under its previous name, tryUntilSuccessful, partly because they weren't really doing much. Glad to see that there are already tests for isClientError.

Does that sound right?

gnprice · 2020-06-18T23:36:23Z

Yeah, let's have tests for promiseTimeout. 🙂 Then for tryFetch, two of the three old tests were pretty trivial, but the third tests that it does retry. It'd also be good to test its new timeout functionality, and the existing isClientError feature.

gnprice

Thanks @chrisbobbe ! This looks like a good solution. Some comments below.

gnprice · 2020-06-19T00:20:09Z

src/utils/async.js

+ * Returns a new Promise with the same outcome as `promise`, if
+ * `promise` completes in time.
+ *
+ * If `promise` does not complete before `timeLimitMs` has passed,
+ * `onTimeout` is called, and its outcome is used as the outcome of
+ * the returned Promise.


I like this term "outcome". It's a good way of conveying this should apply to completion and failure uniformly.

src/utils/async.js

gnprice · 2020-06-19T00:26:13Z

src/utils/async.js

+    if (e instanceof TimeoutError && onTimeout !== undefined) {
+      return onTimeout();


This feels like ideally it belongs at the same place as the new TimeoutError(), and replacing that in the onTimeout case. I think that would simplify the code, because this whole try/catch would then get to go away.

IOW, the whole try/catch could become just

return await Promise.race([promise, timeoutPromise]);

with a suitable version of timeoutPromise that itself handles the onTimeout.

(And then in fact the await could be deleted, along with the async: just return Promise.race(…);.)

Ah, I thought of doing it that way—but then, IIUC, it's not a race between (A) the outcome of promise and (B) the timer (which is what it says on the tin), but rather, between (A) the outcome of promise and (B) the timer plus the outcome of a Promise returned by onTimeout.

Mm, bumping this, @gnprice, as I don't think we got around to it on the call (we were mostly talking about #4166 (comment)).

Concretely, I think you're suggesting something like this:

const timeoutPromise = new Promise((resolve, reject) => setTimeout( () => (onTimeout !== undefined ? resolve(onTimeout()) : reject(new TimeoutError())), timeLimitMs, ), ); return Promise.race([promise, timeoutPromise]);

But this gives promise too much time to complete; it now has timeLimitMs plus however long a Promise returned by onTimeout takes. Worse, if this gives enough time for promise to complete, then we're in a weird in-between state where onTimeout has fired (i.e., the time is properly up) but the outcome is the outcome of promise.

resolve(onTimeout()) might look a bit weird if onTimeout returns a Promise; I'm not used to doing that. But I think this excerpt supports the idea that the outcome of onTimeout() (the Promise returned by onTimeout—which may take arbitrarily long to arrive) is assigned to be the outcome of timeoutPromise.

It's from a big green note here, in a section on the Promise constructor in an ECMAScript spec (formatting lost):

The resolve function that is passed to an executor function accepts a single argument. The executor code may eventually call the resolve function to indicate that it wishes to resolve the associated Promise object. The argument passed to the resolve function represents the eventual value of the deferred action and can be either the actual fulfillment value or another Promise object which will provide the value if it is fulfilled.

I'll include a test for this case and you can play around with it if you have doubts. 🙂

Hmm, good point. And good thought to add a test for it. 🙂

It still feels like this logic is sort of awkwardly split between the new TimeoutError() and the later catch and if (e instanceof TimeoutError && onTimeout !== undefined). But evidently it'd take some deeper thinking about these promise and timer semantics to fix that than I'd thought. And there isn't that much to gain -- the behavior seems correct as it is, and the code is already short and not hard to understand.

gnprice · 2020-06-19T00:32:15Z

src/message/fetchActions.js

+      if (isClientError(e) || e instanceof TimeoutError) {
+        throw e;
+      }
+      await backoffMachine.wait();


Ideally this would also be covered by the logic expressed by timeLimitMs above.

Hmm, true—see one possible way this might look, at #4166 (comment).

gnprice · 2020-06-19T00:34:15Z

src/message/fetchActions.js

+  const timeAtStart = Date.now();
+  // eslint-disable-next-line no-constant-condition
+  while (true) {
+    const msElapsed = Date.now() - timeAtStart;
+    // If the first attempt had a 60-second timeout, and it took 3
+    // seconds, then the second attempt should have a 57-second
+    // timeout. Etc.
+    const timeLimitMs = MAX_TIME_MS - msElapsed;
+    try {
+      return await promiseTimeout<T>(func(), timeLimitMs);
+    } catch (e) {
+      if (isClientError(e) || e instanceof TimeoutError) {
+        throw e;
+      }
+      await backoffMachine.wait();
+    }
+  }


Looking at this and particularly with the comment I just made about the backoffMachine.wait(), I wonder: with the spiffy new promiseTimeout helper, could all this timeout logic be simplified by using that around the whole thing? I think it could.

Hmm, if I understand correctly, something like this?

export async function tryFetch<T>(func: () => Promise<T>): Promise<T> { const MAX_TIME_MS: number = 60000; const backoffMachine = new BackoffMachine(); return promiseTimeout<T>( (async () => { // eslint-disable-next-line no-constant-condition while (true) { try { return await func(); } catch (e) { if (isClientError(e) || e instanceof TimeoutError) { throw e; } await backoffMachine.wait(); } } // Without this, Flow 0.92.1 does not know this code is unreachable, // and it incorrectly thinks Promise<undefined> could be returned, // which is inconsistent with the stated Promise<T> return type. // eslint-disable-next-line no-unreachable throw new Error(); })(), MAX_TIME_MS, ); }

Oh, hmm, that's not quite it—the infinite loop could continue infinitely, while being ignored, even after the timer was up. I'll see what I can think of.

Er, there's probably a more elegant solution, but this might be working code:

export async function tryFetch<T>(func: () => Promise<T>): Promise<T> { const MAX_TIME_MS: number = 60000; const backoffMachine = new BackoffMachine(); let timerHasExpired = false; return promiseTimeout( (async () => { // eslint-disable-next-line no-constant-condition while (true) { if (timerHasExpired) { // No one is listening for this Promise's outcome, so stop // doing more work. throw new Error(); } try { return await func(); } catch (e) { if (isClientError(e)) { throw e; } await backoffMachine.wait(); } } // Without this, Flow 0.92.1 does not know this code is unreachable, // and it incorrectly thinks Promise<undefined> could be returned, // which is inconsistent with the stated Promise<T> return type. // eslint-disable-next-line no-unreachable throw new Error(); })(), MAX_TIME_MS, () => { timerHasExpired = true; throw new TimeoutError(); } ); }

src/message/fetchActions.js

We'll use the `async` ones [1] in upcoming commits (the non-`async` `runAllTimers` is added for completeness, since we add its `async` counterpart). I'm not really sure how, but these will fix issues where the expected behavior in a non-Lolex world isn't observed. I suspect this is relevant: """ The `runAllAsync()` will also break the event loop, allowing any scheduled promise callbacks to execute before running the timers. """ and the similar comment on `runToLastAsync`. [1]: https://github.com/sinonjs/fake-timers#clockrunall--await-clockrunallasync

As Greg says [1], it's always been closely tied to `doInitialFetch`'s specific needs, with its handling of 4xx errors. We're about to make it more so, by adding timeout logic. Also, remove two pretty trivial tests, and adapt the third (that tests that a retry is done) to use Lolex. Add another test to confirm that a 4xx error does indeed halt the retry loop and get thrown. [1]: zulip#4165 (comment)

To be dispatched when we time out attempts to connect to a server that isn't responding, or when a server responds with a 5xx error. It navigates to the 'accounts' screen so a user can try a different account and server. Logging out wouldn't be good; the credentials may be perfectly fine, and we'd like to keep them around to try again later. We also need to ensure `needsInitialFetch` is set to `false` [1], while avoiding the semantics (and a navigation to the 'main' screen, which depends on data from the initial fetch!) of `INITIAL_FETCH_COMPLETE`. See also discussion [2]. [1]: As also noted in 7caa4d0, `needsInitialFetch` is "edge-triggered": """ [I]t's consulted inside a `componentDidUpdate` method, which means it typically will cause a `doInitialFetch` call only when it's *updated* to true, not if it's continuously true. """ This logic seems complex and fragile, and it would be nice to fix that. [2]: https://chat.zulip.org/#narrow/stream/243-mobile-team/topic/Stuck.20on.20loading.20screen/near/907591

As Greg points out [1], this makes the most sense conceptually; it's happening at the bottom of the loop, just before a new iteration starts. The `return` in the `try` block is enough to ensure that this wait won't interfere with a successful fetch. [1]: zulip#4166 (comment)

We'll use this in `tryFetch`, in an upcoming commit, so that we can interrupt in-progress attempts to contact a server when they take too long. See discussion [1]. [1]: https://chat.zulip.org/#narrow/stream/243-mobile-team/topic/Stuck.20on.20loading.20screen/near/907693

Using `INITIAL_FETCH_ABORT` and `promiseTimeout`, which we set up in the last few commits, abort and go to the accounts screen if the initial fetch is taking too long. Fixes: zulip#4165

As Greg points out [1], this makes the most sense conceptually; it's happening at the bottom of the loop, just before a new iteration starts. The `return` in the `try` block is enough to ensure that this wait won't interfere with a successful fetch. [1]: zulip#4166 (comment)

As hinted by the comment above this edit, even an empty array is incomplete server data: "Valid server data must have a user: the self user, at a minimum.". But the initial state of `users` (`initialState` in src/users/usersReducer.js) is an empty array. LOGOUT, LOGIN_SUCCESS, and ACCOUNT_SWITCH all set `users` to that initial state. Those three actions are handled in `navReducer` by navigating to routes that don't depend on complete server data: 'loading' or 'account'. But if we quit the app with `users` having this initial state (e.g., giving up in frustration while trying to connect to a realm for a long time), we should be sure that, on reopening it and rehydrating, we don't try to go to 'main' before that initial state has been replaced with a complete state. Otherwise, we see "No server data found" errors and a white screen. See also discussion [1]. [1]: https://chat.zulip.org/#narrow/stream/243-mobile-team/topic/.23M4166.20Time.20out.20initial.20fetch/near/912017

chrisbobbe · 2020-06-23T19:07:25Z

It'd also be good to test its new timeout functionality, and the existing isClientError feature.

It looks like there are already tests for isClientError; did you have more thoroughness in mind?

I just pushed my latest revision, so this is ready for another review.

gnprice

Thanks for the revisions! Comments below.

gnprice · 2020-06-24T22:03:50Z

src/message/__tests__/fetchActions-test.js

+      };
+
+      const tryFetchPromise = tryFetch(async () => {
+        await new Promise(r => setTimeout(r, 10));


Clearer as sleep, I think: await sleep(10);.

(Ditto below.)

gnprice · 2020-06-24T22:14:25Z

src/utils/__tests__/promiseTimeout-test.js

+
+  test('If `promise` resolves with `x` before time is up, resolves with `x`, `onTimeout` not called', async () => {
+    const x = Math.random();
+    const quickPromise = new Promise(resolve => setTimeout(() => resolve(x), 1));


Staring at this a bit to unpack it, I think it means the same thing as

const quickPromise = sleep(1).then(() => x);

Does that seem right? I think that would be easier to understand.

gnprice · 2020-06-24T22:15:06Z

src/utils/__tests__/promiseTimeout-test.js

+    const quickPromise = new Promise((_, reject) =>
+      setTimeout(() => reject(new Error(x.toString())), 1),
+    );


And this would be sleep(1).then(() => { throw new Error(x.toString()); }).

gnprice · 2020-06-24T22:27:38Z

src/utils/__tests__/promiseTimeout-test.js

+    const onTimeout = jest.fn();
+
+    const quickPromiseWithTimeout = promiseTimeout(quickPromise, ONE_MINUTE_MS, onTimeout);
+    lolex.runOnlyPendingTimers();


Can we use runAllTimers or runAllTimersAsync for all of these?

In general that seems easiest to think about -- it means we're running the whole thing to completion, and when we inspect the results at the end of the test there's no possibility we're seeing an intermediate state where there's still something more the code would do later if we let it.

There are cases where that isn't possible -- basically where there's an infinite loop -- but it seems worth reserving the greater care of thinking about intermediate states for those cases.

gnprice · 2020-06-24T22:38:06Z

src/__tests__/lib/lolex.js

+  async runOnlyPendingTimersAsync(): Promise<void> {
+    this._clock.runToLastAsync();
+  }
+
+  runAllTimers(): void {
+    this._clock.runAll();
+  }
+
+  async runAllTimersAsync(): Promise<void> {
+    this._clock.runAllAsync();
+  }
+


It looks like in this version you don't end up using runOnlyPendingTimersAsync.

I suspect the key thing that was needed here wasn't the "async", but the "all". In the tests that use runAllTimersAsync, there's an onTimeout callback which... as its name predicts, gets called only after a timeout, and which then creates a new timeout. So runOnlyPendingTimers won't run that new timeout, because it wasn't already pending when runOnlyPendingTimers was invoked.

This is probably a sign that we should indeed make a habit of just using the "all" versions all the time 🙂, as I suggested in another comment just now, outside of the specific cases where there's a reason we can't.

gnprice · 2020-06-24T22:42:39Z

src/utils/__tests__/promiseTimeout-test.js

+      const endlessPromise = new Promise((resolve, reject) => {});
+


This can be pulled out to the top of the describe -- that'll make most of these tests a couple of lines shorter.

They're all doing very similar things, so the more we can shorten the repeating parts and highlight the parts that actually differ, the easier it is to see what-all is being tested.

gnprice · 2020-06-24T22:47:20Z

src/utils/__tests__/promiseTimeout-test.js

+      const endlessPromiseWithTimeout = promiseTimeout(endlessPromise, ONE_MINUTE_MS, onTimeout);
+      await lolex.runAllTimersAsync();
+
+      const result = await endlessPromiseWithTimeout;


We have this pattern a lot in our Lolex-using tests, of:

create a promise

run timers

await the promise.

The usual idiom is to combine "create promise" and "await promise" on the same line without giving the promise a name; but that isn't possible here because the "run timers" step needs to go in between.

How about pulling this pattern out as a function, perhaps named like withAllTimers? So e.g.

const result = await withAllTimers(promiseTimeout(endlessPromise, ONE_MINUTE_MS, onTimeout));

I think that'd help simplify a lot of these tests, and further contribute to isolating the parts that are interesting and different between them.

gnprice · 2020-06-24T22:50:03Z

src/utils/__tests__/promiseTimeout-test.js

+      await endlessPromiseWithTimeout;
+
+      expect(onTimeout).toHaveBeenCalledTimes(1);
+      expect(onTimeout).toHaveBeenCalledWith();


This part is redundant with Flow, I think -- the implementation of promiseTimeout wouldn't pass the type-checker if it tried to pass arguments to a callback of type () => T | Promise<T>.

As Greg points out [1], this makes the most sense conceptually; it's happening at the bottom of the loop, just before a new iteration starts. The `return` in the `try` block is enough to ensure that this wait won't interfere with a successful fetch. [1]: zulip#4166 (comment)

chrisbobbe · 2020-08-12T05:58:27Z

Closing, as this revision has grown old and doesn't have working tests—#4193 is working on that.

As Greg points out [1], this makes the most sense conceptually; it's happening at the bottom of the loop, just before a new iteration starts. The `return` in the `try` block is enough to ensure that this wait won't interfere with a successful fetch. [1]: zulip#4166 (comment)

chrisbobbe requested a review from gnprice June 18, 2020 03:07

chrisbobbe added the P1 high-priority label Jun 18, 2020

chrisbobbe force-pushed the pr-initial-fetch-timeout branch from ea9f61e to 1dcc18f Compare June 18, 2020 03:38

gnprice reviewed Jun 19, 2020

View reviewed changes

chrisbobbe added 3 commits June 23, 2020 11:42

chrisbobbe mentioned this pull request Jun 23, 2020

Use AbortController to abort fetches and more. #4170

Open

chrisbobbe force-pushed the pr-initial-fetch-timeout branch from 1dcc18f to a647c8e Compare June 23, 2020 18:59

chrisbobbe added 4 commits June 23, 2020 12:03

initial fetch: Add a timeout to abort.

4d1d235

Using `INITIAL_FETCH_ABORT` and `promiseTimeout`, which we set up in the last few commits, abort and go to the accounts screen if the initial fetch is taking too long. Fixes: zulip#4165

chrisbobbe force-pushed the pr-initial-fetch-timeout branch from a647c8e to 785232e Compare June 23, 2020 19:05

gnprice reviewed Jun 24, 2020

View reviewed changes

chrisbobbe mentioned this pull request Jul 16, 2020

Initial fetch timeout (with working tests) #4193

Closed

chrisbobbe closed this Aug 12, 2020

chrisbobbe mentioned this pull request Aug 12, 2020

Prep for initial-fetch timeout. #4226

Merged

chrisbobbe deleted the pr-initial-fetch-timeout branch November 5, 2021 00:29

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Time out initial fetch, and go to account-picker screen #4166

Time out initial fetch, and go to account-picker screen #4166

chrisbobbe commented Jun 18, 2020 •

edited

chrisbobbe commented Jun 18, 2020 •

edited

gnprice commented Jun 18, 2020

gnprice left a comment

gnprice Jun 19, 2020

gnprice Jun 19, 2020

chrisbobbe Jun 19, 2020

chrisbobbe Jun 23, 2020

chrisbobbe Jun 23, 2020 •

edited

chrisbobbe Jun 23, 2020 •

edited

chrisbobbe Jun 23, 2020 •

edited

chrisbobbe Jun 23, 2020

gnprice Jun 24, 2020

gnprice Jun 19, 2020

chrisbobbe Jun 22, 2020

gnprice Jun 19, 2020

chrisbobbe Jun 19, 2020

chrisbobbe Jun 19, 2020 •

edited

chrisbobbe Jun 19, 2020 •

edited

chrisbobbe commented Jun 23, 2020

gnprice left a comment

gnprice Jun 24, 2020

gnprice Jun 24, 2020

gnprice Jun 24, 2020

gnprice Jun 24, 2020

gnprice Jun 24, 2020

gnprice Jun 24, 2020

gnprice Jun 24, 2020

gnprice Jun 24, 2020

chrisbobbe commented Aug 12, 2020

		if (e instanceof TimeoutError && onTimeout !== undefined) {
		return onTimeout();

		const endlessPromise = new Promise((resolve, reject) => {});

Time out initial fetch, and go to account-picker screen #4166

Time out initial fetch, and go to account-picker screen #4166

Conversation

chrisbobbe commented Jun 18, 2020 • edited

chrisbobbe commented Jun 18, 2020 • edited

gnprice commented Jun 18, 2020

gnprice left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

chrisbobbe Jun 23, 2020 • edited

Choose a reason for hiding this comment

chrisbobbe Jun 23, 2020 • edited

Choose a reason for hiding this comment

chrisbobbe Jun 23, 2020 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

chrisbobbe Jun 19, 2020 • edited

Choose a reason for hiding this comment

chrisbobbe Jun 19, 2020 • edited

Choose a reason for hiding this comment

chrisbobbe commented Jun 23, 2020

gnprice left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

chrisbobbe commented Aug 12, 2020

chrisbobbe commented Jun 18, 2020 •

edited

chrisbobbe commented Jun 18, 2020 •

edited

chrisbobbe Jun 23, 2020 •

edited

chrisbobbe Jun 23, 2020 •

edited

chrisbobbe Jun 23, 2020 •

edited

chrisbobbe Jun 19, 2020 •

edited

chrisbobbe Jun 19, 2020 •

edited