
Support per-request cache with SSR #461

Closed
Ephem opened this issue May 6, 2020 · 12 comments
Labels: enhancement (New feature or request), merge pending

Comments

@Ephem
Collaborator

Ephem commented May 6, 2020

This is, in a sense, a continuation of the now closed #70.

First of all, thanks for your hard work on this library, it's great! I'm currently designing a new somewhat complex data fetching solution at work and I would like to build it on top of react-query, but there is some functionality around SSR that I would like to discuss.

I think the call to not cache data at all in #70 was the right one given the circumstances. For security reasons, there cannot be a cache by default on the server, since it might leak user data across requests. However, this has some cumbersome consequences:

  • Using queryCache.prefetchQuery ahead of the server rendering (like in Next.js getInitialProps) won't automatically prime the cache
  • Calling useQuery with initialData in one place won't prime the cache for use in a second place, so reading the same data from the cache in multiple places becomes impossible

A big point of react-query is to be the global cache for data. Since this is not true on the server, a lot of nice patterns fall apart, and the only way to fix that is to implement your own cache on the server that wraps react-query, for example using a custom useQueryWithSSRCache that passes some initialData every time it is used. This seems a bit backwards, since it re-implements a somewhat big part of react-query.

My suggestion is to make using a cache on the server opt-in and to design it around creating a new cache per request that you place on a context provider. If a cache is available on the context, use that instead of the global one.

The other part of the puzzle is to make that cache serialisable (you probably want to destroy the entire cache when you do so), so you can send it to the client and hydrate it there. On the client you could either hydrate the global cache before rendering, or create a cache and place it on a context like on the server (but then you can't import queryCache like normal).

This is also something that will very likely be needed to support Suspense-based server rendering in the future: if you don't have a cache per request, what do you read from to see if you need to suspend or not? I'm not sure if this is something you are interested in, or if you deem it out of scope for the library? From the Readme:

Query caches are not written to memory during SSR. This is outside of the scope of React Query and easily leads to out-of-sync data when used with frameworks like Next.js or other SSR strategies.

(Btw, I'm curious about what you mean by out of sync here?)

I'd love to get your input on this! This is a complex area, so even though this became somewhat long there is still a lot of nuance here. I'm very open to discussing it further and might very well be interested in contributing to something like this as well.

@tannerlinsley
Collaborator

Go on, go on.. I'm listening 😉 I definitely want to figure this out.

So, are you suggesting something like const queryCache = makeQueryCache() and then allowing the query cache to be passed down through the config provider? I like this idea, as long as it doesn't interfere with the current non-SSR way of doing things (so defaulting to the global singleton if the config provided cache isn't available). The weird thing here on the client would be that if you can make your own query cache singleton, then your app can't just import the queryCache that is provided by RQ. It would need to import your special one instead. I find this slightly annoying, but as long as we can prove that this is an edge case or something that would only be used on the client for testing (maybe?), then I would feel better about it. As for using it on the server... that makes a lot of sense. But then using the config provider would be required and... blah blah blah... it starts to get messy.

I do want to fix this though.

@Ephem
Collaborator Author

Ephem commented May 6, 2020

API design

You got it exactly right and there would be no breaking changes.

You asked for it, so here comes a braindump! 😉 I just typed everything out in one go before heading to dinner, so this is not exhaustive or necessarily super thought-through, but hopefully it's a good starting point. As I said, if there is interest and this seems like a promising direction, I can try to find some time to take a stab at a PoC to look at.

Example

A possible API could look something like this:

// server.js

async function handleRequest(req, res) {
  const queryCache = makeQueryCache();

  // This would usually happen in a framework or be abstracted somehow
  await queryCache.prefetch('something', fetchSomething);

  const markup = ReactDOMServer.renderToString(
    // Could also use the existing provider
    <ReactQueryCacheProvider cache={queryCache}>
      <App />
    </ReactQueryCacheProvider>
  );

  // This should probably be named something else and just return a raw object that the
  // user can choose to serialize themselves however they want, to keep react-query lean
  const serializedCache = queryCache.serialize();

  res.send(`
${someHtmlTemplateStart}
<body>
<div id="root">${markup}</div>
<script id="initial_payload" type="application/json" charset="utf-8">
  { "REACT_QUERY_CACHE": ${serializedCache} }
</script>
${someHtmlTemplateEnd}
`);
}
// client.js

import { queryCache, makeQueryCache, ReactQueryCacheProvider } from 'react-query';

const initialPayload = JSON.parse(
  document.getElementById('initial_payload').textContent
);

/* -- Alternative 1: Hydrating to global cache -- */
queryCache.hydrate(initialPayload.REACT_QUERY_CACHE);
ReactDOM.hydrate(<App />, document.getElementById('root'));

/* -- Alternative 2: Using cache provider -- */
const clientQueryCache = makeQueryCache(initialPayload.REACT_QUERY_CACHE);
ReactDOM.hydrate(
  <ReactQueryCacheProvider cache={clientQueryCache}>
    <App />
  </ReactQueryCacheProvider>,
  document.getElementById('root')
);

These are all pretty common patterns when working with SSR (though it's more common today to use a Provider on the client than some module-based global state).

Server rendering

Usually when you use import { queryCache } from 'react-query'; directly, it's in effects and event handlers that don't run on the server anyway. For any parts where you need direct access to the queryCache on the server, you can access it from context or pass it in explicitly from the handleRequest above:

  • When prefetching before the ReactDOMServer.renderToString
  • When reading from the cache in the react-query hooks
  • In the future: When using Suspense to suspend the server rendering (also in the hooks)

Also, IF you have provided a cache via context on the server, { initialData } should work the same as on the client and prime that cache for other calls with the same key.
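A minimal sketch of what that could look like (fetchUser and the component names are placeholders, useQuery is imported from react-query, and this assumes the proposed per-request cache is available on context):

function UserHeader({ user }) {
  // initialData seeds the 'user' key in the cache, on the server as well
  const { data } = useQuery('user', fetchUser, { initialData: user });
  return <h1>{data.name}</h1>;
}

function UserFooter() {
  // Reads the already-primed 'user' entry instead of re-fetching
  const { data } = useQuery('user', fetchUser);
  return <footer>{data.email}</footer>;
}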

This should also work with streaming out of the box. The future streaming Suspense server renderer, combined with progressive hydration, might require figuring out things like streaming serialization and hydration, which should be doable, but who knows? Another unknown is how React Blocks would play into this, but that's also pretty experimental for now as I understand it.

Client

Providing the queryCache via context on the client would be completely optional and could even be hidden away in the docs. This could unlock interesting things in itself though, for example using separate caches when using multiple React roots to isolate applications. Sometimes I imagine you would want a shared cache, sometimes you wouldn't.

I haven't looked enough at the source, so I imagine supporting multiple caches in different roots or different parts of an application would require more work as well, which might or might not be worth it. But the nice thing about the "provide the cache via a provider" part is that you get it for free anyway when building out the SSR support. 😄

Since we can place the queryCache on a Provider with this approach, we might as well provide a useQueryCache hook that accesses it, since that's extremely cheap to do and has no real downsides.
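A tiny sketch of what such a hook could look like in use (the hook itself is only proposed at this point, and ClearTodos is a made-up component):

function ClearTodos() {
  // Reads whichever cache is on context, falling back to the global one
  const queryCache = useQueryCache();
  return (
    <button onClick={() => queryCache.setQueryData('todos', [])}>
      Clear todos
    </button>
  );
}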

Serializing the cache to localStorage and hydrating it on reload (for scroll restoration) could be another use case that this would support, by the way.
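A rough sketch of that, using the proposed serialize()/makeQueryCache(initialData) shape from the example above (not existing APIs):

// Persist the cache before the page unloads...
window.addEventListener('beforeunload', () => {
  localStorage.setItem('REACT_QUERY_CACHE', JSON.stringify(queryCache.serialize()));
});

// ...and warm a new cache from it on the next load
const stored = localStorage.getItem('REACT_QUERY_CACHE');
const warmQueryCache = makeQueryCache(stored ? JSON.parse(stored) : undefined);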

Supporting different frameworks

Next.js

Even though the examples above are low-level custom SSR ones, I'm pretty sure we can make it work with Next.js as well: return the serialized cache from getInitialProps, create a cache from that client-side and put it on a context like alternative 2 above. We might be able to provide alternative 1 as well by having a hook that can kind of provide initialData for the entire cache, but I have some unknowns there. with-apollo-and-redux seems like a good example to look at.
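Roughly, using the proposed APIs from the example above (not shipped yet; Posts, fetchPosts and the 'posts' key are placeholders):

// pages/index.js
import React from 'react';
import { makeQueryCache, ReactQueryCacheProvider } from 'react-query';

export default function IndexPage({ serializedCache }) {
  // Build a client-side cache from the serialized payload, as in alternative 2
  const [queryCache] = React.useState(() => makeQueryCache(serializedCache));
  return (
    <ReactQueryCacheProvider cache={queryCache}>
      <Posts />
    </ReactQueryCacheProvider>
  );
}

IndexPage.getInitialProps = async () => {
  const queryCache = makeQueryCache();
  await queryCache.prefetchQuery('posts', fetchPosts);
  return { serializedCache: queryCache.serialize() };
};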

I think this should work similarly with the new getServerSideProps and getStaticProps APIs there as well, but there are some caveats. getServerSideProps gets executed on the server on a page transition, so there would need to be a way to "add to" the existing cache instead of hydrating it from scratch. react-query already supports that for single queries as I understand it, so it should be no problem, but there might be ways to improve the DX.

Gatsby

I'm not as well versed in Gatsby so I would need to do some research, but I see no reason the approach shouldn't work there as well since it's a common pattern.

React Router v6 (alpha)

RR6 has a new preload function that gets called before transitioning to the route. I haven't checked, but since RR is not a metaframework and has always been "low level" when it comes to SSR, I imagine you call this manually on the server before rendering, so you can probably just use the pattern from the example and pass in queryCache yourself. There are nuances around how to pass it in the same way on the client if you are not using a global cache, but it should work.

Remix

I haven't checked out the previews fully yet and there are probably still unknowns, so same as Gatsby.

tannerlinsley added the enhancement (New feature or request) label May 11, 2020
Ephem pushed a commit to Ephem/react-query that referenced this issue Jun 10, 2020
Add makeServerQueryCache as a way to create a queryCache that caches data on the server.
Add queryCache.dehydrate as a way to dehydrate the cache into a serializeable format.
Add initialQueries as an option to makeQueryCache as a way to rehydrate/create a warm cache.

Closes TanStack#461
@Ephem
Collaborator Author

Ephem commented Jun 14, 2020

Just to document the progress:

#476 added ReactQueryCacheProvider 🎉

#570 was a first attempt to bring the other pieces together; a great discussion led to some insights, as well as to breaking the PR apart into pieces:

  • Cache data on the server with manually created caches #584 - Enables caching on the server for caches created by makeQueryCache (Breaking)
  • Future PR - De/rehydration should have a functional approach (e.g. const dehydratedQueries = dehydrate(queryCache) / rehydrate(queryCache, dehydratedQueries)) and (probably?) have its own entry point to keep the payload down for the main lib - see the sketch after this list
  • Future PR - Examples and new SSR-guide in docs
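A rough sketch of that functional shape, reusing the payload naming from the earlier example (dehydrate/rehydrate are proposed names at this point, not shipped APIs):

// server – after renderToString, turn the per-request cache into a plain object
const dehydratedQueries = dehydrate(queryCache);
// ...embed JSON.stringify(dehydratedQueries) in the HTML payload as before

// client – merge the embedded payload into the cache before hydrating
rehydrate(queryCache, initialPayload.REACT_QUERY_CACHE);
ReactDOM.hydrate(<App />, document.getElementById('root'));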

Just to be clear, the example API design in the last message is outdated, but the general approach is still valid.

@PepijnSenders
Contributor

@Ephem I tried to use your new APIs with Next.js + SSG; you might want to take a look: https://github.com/PepijnSenders/react-query-next-ssg-example. I still had to manually hydrate the cache, but otherwise it looks quite amazing!

@Ephem
Collaborator Author

Ephem commented Jun 18, 2020

@PepijnSenders That's awesome! I have limited internet/computer access until next week so I can't look at it in detail, but at a glance this looks very close to how I've imagined it, just with a bit more boilerplate, which the de/rehydrate APIs are meant to solve. 🎉

If you are up for it, I definitely think that example should become the official one when the APIs are fully there?

Btw, just to be clear, #476 which does the heavy lifting here was created by @jackmellis ❤️

@PepijnSenders
Contributor

Sure, I can help with that! Enjoy your time away :)

@tannerlinsley
Collaborator

Next should take care of this. Huzzah!

@pseudo-su

Should this still be open? Or is there another issue tracking the de/rehydration?

@Ephem
Collaborator Author

Ephem commented Jul 25, 2020

@pseudo-su You could say the title of this issue has been resolved, but not all parts of its description. No other issue is currently tracking hydration, but a WIP PR is up in #728.

@pseudo-su

pseudo-su commented Jul 26, 2020

Oh fantastic, I'm very interested in this feature. I'm looking into using react-query for the first time and trying to understand if I can use it for my use case.

It looks like your PR would implement the main thing I'm looking for 🤩. There's quite a lot of reference to Next.js here, so in case providing an alternative adds to the discussion: the ideal react-query Provider API for my use case would be along the same lines as this react-jobs example. react-jobs is a bit old and maybe react-ssr-prepass would be used more now, but the gist is to show what it would look like using react-query with the React SSR functionality directly rather than through a framework.

// server.js – adapted from the react-jobs SSR example
import React from 'react'
import { renderToString } from 'react-dom/server'
import { JobProvider, createJobContext } from 'react-jobs'
import asyncBootstrapper from 'react-async-bootstrapper' // used by the original react-jobs example
import serialize from 'serialize-javascript' // likewise

import MyApp from './shared/components/MyApp'

function handleRequest(req, res) {
  const jobContext = createJobContext()

  // 👇 Ensure you wrap your application with the provider.
  const app = (
    <JobProvider jobContext={jobContext}>
      <MyApp />
    </JobProvider>
  )

  // 👇 This makes sure we "bootstrap" resolve any jobs prior to rendering
  asyncBootstrapper(app).then(() => {
    // We can now render our app 👇
    const appString = renderToString(app)

    // Get the resolved jobs state. 👇
    const jobsState = jobContext.getState()

    const html = `
      <html>
        <head>
          <title>Example</title>
        </head>
        <body>
          <div id="app">${appString}</div>
          <script type="text/javascript">
            // Serialise the state into the HTML response
            //                                 👇
            window.JOBS_STATE = ${serialize(jobsState)}
          </script>
        </body>
      </html>`

    res.send(html)
  })
}

// client.js
import React from 'react'
import { render } from 'react-dom'
import { JobProvider } from 'react-jobs'

import MyApp from './shared/components/MyApp'

// Get any "rehydrate" state sent back by the server
//                               👇
const rehydrateState = window.JOBS_STATE

// Surround your app with the JobProvider, providing
// the rehydrateState
//     👇
const app = (
  <JobProvider rehydrateState={rehydrateState}>
    <MyApp />
  </JobProvider>
)

// Render 👍
render(app, document.getElementById('app'))

@pseudo-su

pseudo-su commented Jul 26, 2020

I might be getting ahead of myself, and maybe these make sense as separate features after the ReactQueryCacheProvider is done, but maybe I'm just excited after seeing some of the cache/staleness settings in the useQuery API. I wanted to mention these thoughts because I think they're worth considering when thinking about SSR support for a library like react-query that implements a queryCache.

Setting freshness/staleness settings based on the query response

I might be wrong but as far as I can tell there doesn't seem to be a way to set the freshness/staleness settings of a query in the queryCache using the response values of the query promise.

As I understand it, this means that if you make a fetch() request for data and pass it to useQuery(), you have to decide upfront what the staleTime in ms should be. I think this leads to the following potentially undesirable situations:

Unnecessary re-triggering of query function

You might decide to make the staleTime 5 minutes; if the fetch() request then returns Cache-Control: max-age=1800 (30 minutes), the staleTime in the queryCache won't reflect the "real" staleness value. This causes unnecessary re-running of the query function.

Stale data being cached too long

You might decide to make the staleTime 30 minutes, while the fetch() request returns Cache-Control: max-age=300 (5 minutes) because the API/server knows that this data updates more rapidly than other responses. The queryCache then thinks the data is fresh for much longer than it should be, which can lead to a confusing user experience when users don't see updated information in the UI.

Accidental "cache/freshness doubling"

If you have a CDN in front of your API, it's likely that the data you're receiving from the server is already "aged" even if it's not yet considered "stale". For example, if your server returns a response with the headers:

X-Cache: HIT
Cache-Control: max-age=300, s-maxage=300
Age: 240

This would indicate that while the document is allowed to live for 300 seconds (5 minutes) before being considered stale, it has already been stored in the CDN for 240 seconds (4 minutes), so in reality, once it gets rendered into the page, it should only be considered "fresh" for roughly 60 more seconds.

If I can only set staleness settings before the queryFn resolves, I might choose staleTime: 300000 (5 minutes). If the response from my API is a HIT from the cache, the effective "max age" that my query can be considered fresh for (by react-query) is effectively doubled to 10 minutes (5 minutes in the CDN cache and 5 minutes in react-query's queryCache).

NOTE: There is inconsistency in which response headers CDNs support/use to inform the client of the age of the response (Expires, Age, etc.).
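To illustrate the arithmetic, a sketch of deriving an effective freshness window from the response headers (this is not a react-query API, just the calculation described above; fetchWithFreshness is a made-up helper):

async function fetchWithFreshness(url) {
  const response = await fetch(url);
  const cacheControl = response.headers.get('cache-control') || '';
  // Naive parsing – a real implementation would use a proper Cache-Control parser
  const match = cacheControl.match(/(?:^|[\s,])max-age=(\d+)/);
  const maxAge = match ? parseInt(match[1], 10) : 0;
  const age = parseInt(response.headers.get('age') || '0', 10);

  // e.g. max-age=300 with Age: 240 leaves roughly 60 seconds of freshness
  const effectiveStaleTimeMs = Math.max(maxAge - age, 0) * 1000;

  return { data: await response.json(), effectiveStaleTimeMs };
}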

Using Cache-Control semantics

Using Cache-Control semantics (or a subset of directives) when storing items in the queryCache would make it possible to determine the appropriate Cache-Control settings of a server-side rendered page.

For example: I have a /dashboard page in my React application, and in order to resolve all the data on that page I need to make three different queries. They might return the following response headers:

<!--GET /api/v1/people-->
Content-Type: application/json
X-Cache: HIT
Cache-Control: max-age=1200, s-maxage=1200
Age: 1

<!--GET /api/v1/pets-->
Content-Type: application/json
X-Cache: HIT
Cache-Control: max-age=300, s-maxage=300
Age: 0

<!--GET /api/v1/news-->
Content-Type: application/json
X-Cache: MISS
Cache-Control: max-age=0, s-maxage=1500

Technically, if I want to prevent my server-rendered React HTML pages from being cached when they contain stale data, I should set the Cache-Control response header of my rendered page to no larger than the smallest values of the Cache-Control settings for all the data that was used to construct the page, e.g.:

<!--GET /dashboard-->
Content-Type: text/html
Cache-Control: max-age=0, s-maxage=300

This would also extend to things like the Cache-Control: private directive, e.g. if any of the HTTP requests used to render the page contained a private directive in their Cache-Control, then naturally the rendered HTML page should also be "private" (not stored in a shared cache) because it contains private data.
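As a sketch of that rule (not a react-query feature; the input shape is made up for illustration):

// queryResponses: one entry per query used to render the page,
// e.g. [{ sMaxAge: 1200 }, { sMaxAge: 300 }, { sMaxAge: 1500 }]
function pageCacheControl(queryResponses) {
  if (queryResponses.some((r) => r.private)) {
    // Any private response makes the whole page private
    return 'private, max-age=0';
  }
  const sMaxAge = Math.min(...queryResponses.map((r) => r.sMaxAge || 0));
  return `max-age=0, s-maxage=${sMaxAge}`;
}

// For the /dashboard example above this yields: max-age=0, s-maxage=300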

Having a "private" and "shared" cache provider

By default, queries shouldn't be shared across requests, but some are OK to share across multiple users/requests. Any query/request that specifies that it's safe to store in a shared cache (e.g. Cache-Control: s-maxage=300) would also be safe to reuse across requests when rendering the SSR app.

import express from 'express';
import React from 'react';
import { renderToString } from 'react-dom/server';
import { makeQueryCache, ReactQueryCacheProvider } from 'react-query';

import App from './App';

const assets = require(process.env.RAZZLE_ASSETS_MANIFEST);
const server = express();

// this cache is used across requests
const sharedQueryCache = makeQueryCache();

server.get('/*', (req, res) => {
  // this cache is used for only a single request
  const requestQueryCache = makeQueryCache();
  const markup = renderToString(
    // sharedCache is a hypothetical prop for this proposal, not an existing react-query API
    <ReactQueryCacheProvider cache={requestQueryCache} sharedCache={sharedQueryCache}>
      <App />
    </ReactQueryCacheProvider>
  );
  // Both would be embedded in the HTML payload for the client to rehydrate
  const cacheData = requestQueryCache.serialize();
  const sharedCacheData = sharedQueryCache.serialize();
  res.send(
    `<!doctype html>
<html lang="">
<head>
    ${
      assets.client.css
        ? `<link rel="stylesheet" href="${assets.client.css}">`
        : ''
    }
</head>
<body>
    <div id="root">${markup}</div>
    <script src="${assets.client.js}" defer crossorigin></script>
</body>
</html>`
  );
});

@Ephem
Collaborator Author

Ephem commented Jul 26, 2020

I'm glad you are interested in this issue!

Custom SSR

With custom SSR, the mechanisms to fill the cache will be either to prefetch the data using queryCache.prefetchQuery (using something like the react-router v6 preload or a custom solution), or to rely on Suspense as the mechanism for fetching data from inside the components. I haven't had time to try out the React Query Suspense support on the server yet, but even though I think prefetching makes sense to avoid waterfalls, I'm interested in this myself. I think it should work today with react-ssr-prepass or react-lightyear.

Either way, there will definitely be custom SSR examples and docs as well as Next versions. I think the focus on getting it to work with Next is simply because that is more constrained than custom SSR, so it's actually somewhat trickier to get right from a library perspective.

Having a "private" and "shared" cache provider

I agree this use case is useful, but I would be a bit hesitant about juggling two caches in the library implementation. Another way to go about this would be to pre-seed the queryCache you use per request from a separate, shared cache that you keep between requests. De/rehydration would be one mechanism that would let you do this.
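A sketch of that idea, assuming the proposed dehydrate/rehydrate helpers (not existing APIs):

// Lives across requests and only ever holds data that is safe to share
const sharedQueryCache = makeQueryCache();

function handleRequest(req, res) {
  const requestQueryCache = makeQueryCache();
  // Pre-seed the per-request cache with the shared, non-private data
  rehydrate(requestQueryCache, dehydrate(sharedQueryCache));
  // ...then render with requestQueryCache on the provider, as in the examples above
}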

Btw, there are more things to hammer out around this. Currently no timeouts are scheduled on the server, so data never gets stale there. This is fine for per-request caches, but maybe not optimal for shared ones, or for things like CLI tools that also run in a Node environment.

Staleness based on headers or response data

While unrelated to this issue, these are very interesting suggestions! I suggest you go ahead and open a new issue/discussion for this so it doesn't go unnoticed in this closed one. 😄
