Race condition in query cancellation #3045

rlaffers · 2021-12-03T13:24:12Z

Describe the bug
If the AbortSignal is consumed, a new fetch request initiated by QueryClient.fetchQuery gets cancelled when the previous component unmounts (where the query was used by useQuery).

To Reproduce
Steps to reproduce the behavior:

Go to https://codesandbox.io/s/react-query-cancellation-92jro
Open devtools, select the Network tab.
Click "Colors" link in the app menu.
Check the Network tab. A new request to fruits endpoint is made, then immediately aborted. No data were fetched on the second page.

Expected behavior
As the new component SecondPage mounts, a new request to the fruits endpoint is made and allowed to complete. It should not get cancelled due to the previous component FirstPage unmounting (FirstPage used the same Query by means of useQuery) .

Screenshots

Desktop (please complete the following information):

OS: Ubuntu 20.04
Browser: Chrome
Version: 96

Additional context
For context:
The provided sandbox is a simplified simulation of a real-life app, where the SecondPage component is driven by a state machine (xstate). The state machine calls the queryClient.fetchQuery API during the initialization phase. When the query fails (due to it being aborted), the state machine waits 10 seconds before trying to initialize again. Thus aborting the query creates a needless 10 seconds delay before the component driven by that state machine is rendered.
And why use the fetchQuery API instead of useQuery? We need to use the imperative fetchQuery in the state machine to fetch data at the right time (not on the first render of the component driven by the state machine).
Other scenarios where both useQuery and fetchQuery need to be used across the app are conceivable. The point is, this is not a state machine problem, but rather a problem of racey clean up of observers from an unmounted query having the effect of cancelling requests from the same query used in a freshly mounted component:

// query.ts
removeObserver(observer: QueryObserver<any, any, any, any, any>): void {
    if (this.observers.indexOf(observer) !== -1) {
      this.observers = this.observers.filter(x => x !== observer)

      if (!this.observers.length) {
        // If the transport layer does not support cancellation
        // we'll let the query continue so the result can be cached
        if (this.retryer) {
          if (this.retryer.isTransportCancelable || this.abortSignalConsumed) {
            this.retryer.cancel({ revert: true })
          } else {
            this.retryer.cancelRetry()
          }
        }

        if (this.cacheTime) {
          this.scheduleGc()
        } else {
          this.cache.remove(this)
        }
      }

      this.cache.notify({ type: 'observerRemoved', query: this, observer })
    }
  }

this.retryer.cancel({ revert: true }) cancels the new request made from QueryClient.fetchQuery.

The text was updated successfully, but these errors were encountered:

TkDodo · 2021-12-03T14:14:03Z

As the new component SecondPage mounts, a new request to the fruits endpoint is made and allowed to complete. It should not get cancelled due to the previous component FirstPage unmounting (FirstPage used the same Query by means of useQuery) .

I think this is exactly the problem. They both "use" the same query, but they do not both "observe" it. react-query doesn't differentiate between who initiated the fetch. For example, if you have one observer, and you trigger a fetch via a window focus, and then you unmount the observer because you go to a different page, we also cancel that fetch because there is no one interested in that data (if the signal was consumed).

I think this is similar to what happens in your case:

you trigger a fetch via fetchQuery (my example: window focus)
then the observer unmounts by a page navigation
according to the observers, no one is "interested" in the data, so the fetch will be aborted.

yes, the problem could be a "race-condition", but with useQuery, we currently have no other option than to subscribe and unsubscribe in a useEffect,so we are tied to what react offers us here. With react 18, we will move towards useSyncExternalStorage, but I also have no power over the timing then.

one solution would be to defer the query cancellation by a bit to see if a new subscriber mounts before actually cancelling it. This has also been pointed out by the react core team in a discussion about strict effects, because in strict mode and react 18, all effects will fire twice, which might lead to subscribe/unsubscribe/cancel chains, too. But it's a bit hard to judge what "a bit" would be here.

You've said that you only useLayoutEffect to demonstrate what happens in the state machine, but unless it also uses useLayoutEffect, I don't see how this can happen before the old page unmounts? Can you elaborate on what you mean with:

The state machine calls the queryClient.fetchQuery API during the initialization phase

what is the initialization phase please, and how can this happen on a page before the previous page unmounts?

rlaffers · 2021-12-04T21:21:39Z

Yes, your analysis of why this happens is correct. I can also see the current difficulty in determining who initiated the ongoing fetch request (either the observing component being unmounted, or a fetchQuery call from the next component which has just mounted).

what is the initialization phase please, and how can this happen on a page before the previous page unmounts?

The initialization phase is just an Initialization state which is the first state the state machine enters when it is started. I'm attaching this state diagram to clarify the state machine's operation (simplified): https://stately.ai/viz/95727c7e-9e82-4b9c-a029-bbcb0d78c1d3
There is a side effect executed upon entering the Initialization state, namely fetching the fruit.

To be clear, the state machine interpreter (xstate) does make use of useLayoutEffect. Conceptually, it works like this:

function SecondPage() {
  const service = interpret(stateMachine)  // the machine is "interpreted" immediately on this first render. The stateMachine blueprint is used to instantiate a ready-to-start service instance (memoized).
  useLayoutEffect(() => {
    service.start() // the machine is started and enters its first state - Initialization
    // the fetch request is in flight now
  }, [])

I am not deeply familiar with React internal workings. Apparently, it first executes the layout effect from the SecondPage before cleaning up effects from the FirstPage (unmounted).

one solution would be to defer the query cancellation by a bit to see if a new subscriber mounts before actually cancelling it.

Deferring the query cancellation may not help here, since AFAIK QueryClient.fetchQuery call is not going to create an observer. The cleanup effect would still see no observers. And like you say, using small timeouts to beat the React effect cleanup timings would be sketchy anyway.
How about flagging the Query when QueryClient.fetchQuery is called, and check this flag before request cancellation? If the flag is on, the request should not be cancelled.

TkDodo · 2021-12-04T21:50:46Z

right, so the machine uses useLayoutEffect, while we useEffect. That would explain the discrepancies.

I am not deeply familiar with React internal workings. Apparently, it first executes the layout effect from the SecondPage before cleaning up effects from the FirstPage (unmounted).

That is right. Have a look at this sandbox. You can see that the layoutEffect in component two runs before the cleanup of the useEffect in component one.

If both were to use the same effects (both useEffect or both useLayoutEffect), the problem wouldn't happen. Now the question is who is doing it right / wrong 🤔 . I think we don't use layout effects because they will give a warning during server-side rendering. I also don't think layout effects are "meant" for data fetching, but I could be wrong.

Maybe you can find out why the machine uses a layout effect?

That being said, with react 18, we'll move away from fetching in an effect towards useSyncExternalStorage. Can you maybe have a look at the draft PR? It has a codesandbox preview build attached. Maybe you can try it out with this dependency:

"react-query": "https://pkg.csb.dev/tannerlinsley/react-query/commit/f7aab99e/react-query",

beware the breaking changes (query keys need to be arrays).

I am not sure how the timing of things will change with that, but maybe it'll solve the issue.

on a different note, I think the preferred way of using react-query together with xstate is / will be via the input (previously syncToContext proposal):

const Component = (props) => {
  const [result] = useQuery();

  const [state, send] = useMachine(machine, {
    input: {
      // props.id and result.data will be kept up to date
      // in the machine above
      id: props.id,
      data: result.data
    }
  });
};

TkDodo · 2021-12-04T21:52:25Z

Also, if both xstate and react-query move towards uSES, the problem will likely go away:

[@xstate/react] Leverage useSyncExternalStore() now that React 18 is in beta statelyai/xstate#2821

rlaffers · 2021-12-08T09:48:56Z

I posted a related question in the xstate forum.

TkDodo · 2022-02-26T19:12:11Z

As it stands, I think this is more an issue with xstate than react-query. If we all move towards useSyncExternalStorage, as is the plan, we will have the same timing.

TkDodo closed this as completed Feb 26, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Race condition in query cancellation #3045

Race condition in query cancellation #3045

rlaffers commented Dec 3, 2021

TkDodo commented Dec 3, 2021

rlaffers commented Dec 4, 2021

TkDodo commented Dec 4, 2021

TkDodo commented Dec 4, 2021

rlaffers commented Dec 8, 2021

TkDodo commented Feb 26, 2022

Race condition in query cancellation #3045

Race condition in query cancellation #3045

Comments

rlaffers commented Dec 3, 2021

TkDodo commented Dec 3, 2021

rlaffers commented Dec 4, 2021

TkDodo commented Dec 4, 2021

TkDodo commented Dec 4, 2021

rlaffers commented Dec 8, 2021

TkDodo commented Feb 26, 2022