Reactive GraphQL Architecture #4687

Open
captbaritone opened this issue May 2, 2024 · 8 comments

@captbaritone
Contributor

Reactive GraphQL Architecture

This document outlines a vision for using GraphQL to model client data in applications which have highly complex client state. It is informed by the constraints of developing applications for the web, but should be applicable to native applications as well.

GraphQL provides a declarative syntax for application code to specify its data dependencies. While GraphQL was designed for facilitating query/response communications between clients and servers, it has also proved a useful mechanism for implementing client-side data loading from non-GraphQL servers. Implementing your client-side data layer as a GraphQL executor enables decoupling product code from the code which fetches data from a REST server. The GraphQL resolver architecture also provides an opinionated way to model the data layer, forcing it to be implemented in composable fashion, where the GraphQL executor is responsible for composing the individual resolvers together to derive all the needed data for a product surface.
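
As a rough illustration of the resolver composition described above, here is a minimal, dependency-free sketch. Everything in it is hypothetical (`FieldSelection`, `resolvers`, `execute`, the `restUsers` stand-in for REST data); it is not the API of graphql-js or Relay, and it skips arguments, lists, and errors.

```typescript
// Hypothetical sketch: none of these names come from graphql-js or Relay.
type Source = Record<string, unknown>;
type Resolver = (parent: Source) => unknown;
type Resolvers = Record<string, Record<string, Resolver>>;

// Simplified stand-in for a parsed GraphQL selection set.
interface FieldSelection {
  name: string;
  // Set only when the field returns an object type.
  returns?: { typeName: string; selections: FieldSelection[] };
}

// Stand-in for rows that would normally be fetched from a REST endpoint.
const restUsers: Record<string, Source> = {
  "1": { id: "1", first: "Ada", last: "Lovelace", bestFriendId: "2" },
  "2": { id: "2", first: "Grace", last: "Hopper", bestFriendId: "1" },
};

// Each resolver knows how to produce exactly one field.
const resolvers: Resolvers = {
  Query: { me: () => restUsers["1"] },
  User: {
    name: (user) => `${user.first} ${user.last}`,
    bestFriend: (user) => restUsers[String(user.bestFriendId)],
  },
};

// The executor composes resolvers by walking the selection set.
function execute(typeName: string, parent: Source, selections: FieldSelection[]): Source {
  const result: Source = {};
  for (const selection of selections) {
    const resolve = resolvers[typeName]?.[selection.name];
    // Without a resolver, read the property of the same name from the parent.
    const value = resolve ? resolve(parent) : parent[selection.name];
    result[selection.name] =
      selection.returns && value != null
        ? execute(selection.returns.typeName, value as Source, selection.returns.selections)
        : value;
  }
  return result;
}

// Equivalent to the query: { me { name bestFriend { name } } }
const nameField: FieldSelection = { name: "name" };
const data = execute("Query", {}, [
  {
    name: "me",
    returns: {
      typeName: "User",
      selections: [
        nameField,
        { name: "bestFriend", returns: { typeName: "User", selections: [nameField] } },
      ],
    },
  },
]);
```

Product code only sees the shape it asked for; where `restUsers` comes from is hidden behind the resolvers.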

Historically, this architecture is most often encountered in products where the front-end team sees value in the developer experience of GraphQL, but organizational or technical impediments prevent implementing the GraphQL executor on the server. However, we are starting to see other types of applications where this architecture makes sense for purely technical reasons. Examples include:

  • Applications which deal with end-to-end encrypted data, where the data is opaque until it's on the client where it can be decrypted
  • "Local First" applications where a client-side database is the source of truth for all UI and communication with the server (if any) is mediated through that layer

While implementing a GraphQL executor on the client can be an attractive architecture from a developer experience perspective, it creates a number of challenges in terms of efficiency. The rest of this post will describe a proposed evolution of this architecture which preserves its benefits while mitigating many of its challenges.

The Architecture

  • The GraphQL schema is written in an implementation-first style which allows a compiler to infer GraphQL schema from the names and type annotations used in the implementation. See Grats for an idea of what this could look like. This also allows the compiler to understand exactly what code is needed to resolve a given query.
  • The GraphQL resolvers have the option to be reactive, returning streams/observables that represent a value over time rather than simply returning the current value.
  • The GraphQL execution is (optionally) performed off the main thread, for example using a Web Worker or even a shared web worker.
  • Responses, and subsequent updates, from the GraphQL executor are provided to the main thread in a normalized form, rather than as a JSON object matching the shape of the GraphQL operation. This moves potentially expensive normalization work off the main thread, while also enabling efficient patch updates.
  • Product code is presented with a composite schema composed of the data available from the GraphQL server, if one exists, as well as the client-defined schema. Product code should not need to think about the distinction. The compiler and client GraphQL executor are responsible for optimally fetching/computing the needed data.
  • The GraphQL client on the main thread (e.g. Relay) operates using a normalized cache of the same shape generated by the executor, enabling efficient propagation of data updates to the UI.
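
To make the "reactive resolver" and "normalized updates" points more concrete, here is a minimal sketch under stated assumptions. `ValueStream`, `createSubject`, `unreadCountResolver`, `observeField`, and the `inbox:1` record ID are all hypothetical names invented for illustration; a real executor would handle whole operations, not a single field.

```typescript
// Hypothetical sketch; these types are illustrative, not Relay's or graphql-js's API.
type Unsubscribe = () => void;

// A value over time: subscribers receive the current value and every later one.
interface ValueStream<T> {
  subscribe(next: (value: T) => void): Unsubscribe;
}

// A tiny subject standing in for reactive client state.
function createSubject<T>(initial: T): ValueStream<T> & { set(value: T): void } {
  let current = initial;
  const observers = new Set<(value: T) => void>();
  return {
    subscribe(next) {
      observers.add(next);
      next(current); // replay the current value to new subscribers
      return () => {
        observers.delete(next);
      };
    },
    set(value) {
      current = value;
      observers.forEach((next) => next(value));
    },
  };
}

// Normalized payload: records keyed by ID, each holding only the fields sent.
type NormalizedPayload = Record<string, Record<string, unknown>>;

// A reactive resolver returns the field's value over time, not a snapshot.
function unreadCountResolver(inbox: { unread: ValueStream<number> }): ValueStream<number> {
  return inbox.unread;
}

// Turn one reactive field into an initial normalized response plus small patches.
function observeField<T>(
  recordId: string,
  fieldName: string,
  stream: ValueStream<T>,
  emit: (payload: NormalizedPayload) => void,
): Unsubscribe {
  return stream.subscribe((value) => emit({ [recordId]: { [fieldName]: value } }));
}

const inbox = { unread: createSubject(3) };
const payloads: NormalizedPayload[] = [];
const stopObserving = observeField("inbox:1", "unreadCount", unreadCountResolver(inbox), (payload) => {
  payloads.push(payload);
});
inbox.unread.set(4); // emits a one-field patch
stopObserving();
inbox.unread.set(5); // no longer observed, nothing is emitted
```

Each emitted payload names only the record and field that changed, which is what lets the main thread apply it as a patch.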

While the architecture is not prescriptive about any actual tools, Relay, with its compiler and generated code, is well positioned to explore this architecture. Ideally it can eventually be decomposed into distinct tools:

  • Implementation-first GraphQL schema authoring tool
  • Reactive GraphQL executor
  • GraphQL client that accepts "live" normalized responses

Problems Solved

  • Abstracts away the fact that client data is dynamic and thus could change at any time
  • Makes updates from the data layer efficient (O(changed data) as opposed to O(GraphQL operation size))
  • Enables efficient bundling where only the subset of the GraphQL resolvers reachable from a given operation need to be included in the bundle
  • Abstracts away the distinction between server and client data for product code, enabling a single declarative API for reading data. This enables seamlessly moving resolver implementations between client and server without needing to touch product code
  • Client-defined schema can be defined and implemented without needing to also manually specify GraphQL schema (SDL)
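
The O(changed data) claim can be sketched as a normalized store that merges a patch and notifies only subscribers of the records that changed. `NormalizedStore`, `applyPatch`, and the record IDs below are hypothetical; this is not Relay's store, and the comparison is shallow for brevity.

```typescript
// Hypothetical sketch of a normalized store; not Relay's actual store API.
type StoreRecord = Record<string, unknown>;
type NormalizedPatch = Record<string, StoreRecord>;

class NormalizedStore {
  private records = new Map<string, StoreRecord>();
  private subscribers = new Map<string, Set<() => void>>();

  // Register interest in a single record; returns an unsubscribe function.
  subscribe(recordId: string, onChange: () => void): () => void {
    const listeners = this.subscribers.get(recordId) ?? new Set<() => void>();
    listeners.add(onChange);
    this.subscribers.set(recordId, listeners);
    return () => {
      listeners.delete(onChange);
    };
  }

  get(recordId: string): StoreRecord | undefined {
    return this.records.get(recordId);
  }

  // Merge a patch and notify only subscribers of records whose fields changed.
  // The work done is proportional to the patch, not to any operation's size.
  applyPatch(patch: NormalizedPatch): string[] {
    const changed: string[] = [];
    for (const recordId of Object.keys(patch)) {
      const previous = this.records.get(recordId) ?? {};
      const incoming = patch[recordId];
      // Shallow comparison: enough for scalar fields in this sketch.
      const differs = Object.keys(incoming).some(
        (field) => !Object.is(previous[field], incoming[field]),
      );
      if (!differs) continue;
      this.records.set(recordId, { ...previous, ...incoming });
      changed.push(recordId);
    }
    for (const recordId of changed) {
      this.subscribers.get(recordId)?.forEach((notify) => notify());
    }
    return changed;
  }
}

const store = new NormalizedStore();
store.applyPatch({ "user:1": { name: "Ada" }, "user:2": { name: "Grace" } });

let user1Notifications = 0;
let user2Notifications = 0;
store.subscribe("user:1", () => {
  user1Notifications += 1;
});
store.subscribe("user:2", () => {
  user2Notifications += 1;
});

// Only user:2 changes, so only its subscriber is notified.
const changedIds = store.applyPatch({ "user:2": { name: "Grace Hopper" } });
```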

Benefits Preserved

  • Provides an opinionated architecture for the client data layer which enables it to be fully decoupled from application code
  • Provides a single, discoverable, tooling-compatible schema for product code to discover what data is available
  • The loading/preloading/computing of all data needed to render a given surface can be initiated independently of needing to actually render the surface
  • Data sent between the main and worker threads can be efficiently serialized/deserialized with JSON thanks to GraphQL's well-defined serialization semantics
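
As a sketch of what could cross the thread boundary, the following flattens a nested result into records keyed by ID and round-trips it through JSON. The `__ref` link field follows the convention Relay uses in its normalized records; `normalize`, `Nested`, and `FlatRecords` are hypothetical names, and the sketch assumes every object carries an `id` and that the input is a tree (no cycles, no lists).

```typescript
// Hypothetical sketch; assumes every object has an `id` and the input has no cycles.
type Scalar = string | number | boolean | null;
interface Ref {
  __ref: string;
}
type FieldValue = Scalar | Ref;
type FlatRecords = Record<string, Record<string, FieldValue>>;

interface Nested {
  id: string;
  [field: string]: Scalar | Nested;
}

// Flatten a nested object into records keyed by ID, replacing children with links.
function normalize(node: Nested, records: FlatRecords = {}): FlatRecords {
  const record: Record<string, FieldValue> = records[node.id] ?? {};
  records[node.id] = record;
  for (const field of Object.keys(node)) {
    const value = node[field];
    if (typeof value === "object" && value !== null) {
      normalize(value, records);
      record[field] = { __ref: value.id };
    } else {
      record[field] = value;
    }
  }
  return records;
}

const nested: Nested = {
  id: "user:1",
  name: "Ada",
  bestFriend: { id: "user:2", name: "Grace" },
};
const flat = normalize(nested);

// The worker would serialize this; the main thread would parse it.
const wire = JSON.stringify(flat);
const received: FlatRecords = JSON.parse(wire);
```

Because the payload holds only scalars and `__ref` links, nothing is lost in the JSON round trip.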

Open Questions

We are still early in exploring this architecture and some open questions remain:

  • This architecture most likely requires a reactive data store, or some abstraction on top of a non-reactive data store which allows it to appear reactive. We may also find ourselves needing reactive versions of other layers that exist in GraphQL servers: ORM? Ent? DataLoader?
  • The tradeoffs of complexity/efficiency/memory use of the reactive GraphQL executor that returns normalized responses are not yet well understood. I suspect an efficient implementation exists here, but it's a complicated problem with many tradeoffs to explore.
  • Are there viable migration strategies to incrementally adopt this architecture when coming from other existing setups?
  • While the compiler can build a module that imports only the resolver code needed for a specific query, it's unclear how we coordinate code loading between the main thread and worker thread. The main thread bundle knows which operation(s) it's going to dispatch. Somehow that needs to also trigger the worker thread fetching the code to execute those operations, ideally in parallel with the main thread loading its code.
  • What should the semantics of mutations be in the context of a reactive GraphQL executor? Specifically, the response portion of the mutation is generally used to specify which updates the client would like to observe, but with a reactive executor we already expect to be notified of changes to any data we are currently observing.

Collaboration Opportunities

  • Our early work to infer GraphQL schema from typed JavaScript is currently limited to Flow. An external contributor could provide equivalent mappings from SWC or Oxc's AST to our Rust AST.
  • Exploration of incremental migration strategies that could be viable. One approach that might be relevant is outlined here. In general this problem is complicated due to the interconnectedness of a GraphQL schema. You either need a type-safe way to tie the execution of the separate parts back together, or a way to partition the graph.

Sources

These ideas have been explored across various projects:

@alloy
Contributor

alloy commented May 3, 2024

What should the semantics of mutations be in the context of a reactive GraphQL executor? Specifically, the response portion of the mutation is generally used to specify which updates the client would like to observe, but with a reactive executor we already expect to be notified of changes to any data we are currently observing.

In a world where we'd want to ideally eliminate all updaters, optimistic updates will need to be handled by the local data-store layer. Would this become entirely a concern of the application, or do you imagine Relay would still play a role in this?

@alloy
Contributor

alloy commented May 3, 2024

Are there viable migration strategies to incrementally adopt this architecture when coming from other existing setups?

This includes:

  • Existing graphql-js based resolvers
  • Existing app code that uses another GraphQL client

@flow-danny

flow-danny commented May 14, 2024

I was actually thinking about doing a little experiment implementing a Network layer fetchQuery that doesn't really fetch over HTTP, but gets data from local SQLite.

It would do this by running graphql resolvers as if it's a graphql server, avoiding all intermediate serialization.

After every commit from network updates, a crude way to make it reactive could be to invalidate the Store?
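
A minimal sketch of that idea, with loud assumptions: `RequestLike`, `ResponseLike`, `localOperations`, and `fetchQueryLocally` are hypothetical local stand-ins rather than Relay's types, `todoRows` stands in for a SQLite table, and each operation is hand-written here instead of running schema resolvers.

```typescript
// Hypothetical stand-ins for the shapes a GraphQL client's network layer
// passes around. These are not Relay's actual types.
interface RequestLike {
  name: string;
}
type Variables = Record<string, unknown>;
interface ResponseLike {
  data: Record<string, unknown>;
}

// Stand-in for a table in a local SQLite database.
const todoRows = [
  { id: "1", title: "Write RFC", done: true },
  { id: "2", title: "Prototype executor", done: false },
];

// One local implementation per operation name. A fuller version would run
// the schema's resolvers instead of hand-written functions.
const localOperations: Record<string, (variables: Variables) => Record<string, unknown>> = {
  TodosQuery: (variables) => ({
    todos: todoRows.filter((row) => variables.onlyOpen !== true || !row.done),
  }),
};

// Plays the role of an HTTP fetch function, but never touches the network
// and never serializes the result.
function fetchQueryLocally(request: RequestLike, variables: Variables): ResponseLike {
  const run = localOperations[request.name];
  if (!run) {
    throw new Error(`No local implementation for ${request.name}`);
  }
  return { data: run(variables) };
}

const response = fetchQueryLocally({ name: "TodosQuery" }, { onlyOpen: true });
```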

@captbaritone
Contributor Author

In a world where we'd want to ideally eliminate all updaters, optimistic updates will need to be handled by the local data-store layer. Would this become entirely a concern of the application, or do you imagine Relay would still play a role in this?

Still an open question! Perhaps the answer is that you'll be able to do either? Some client data layers may have their own optimistic state mechanism. That would probably have the ability to be more robust. A higher ceiling.

Conversely, some will not, in which case Relay's primitives, which make sense for server data, should still be available.

@captbaritone
Contributor Author

captbaritone commented May 16, 2024

I was actually thinking about doing a little experiment implementing a Network layer fetchQuery that doesn't really fetch over HTTP, but gets data from local SQLite.

I did a prototype of something similar to this using Relay Resolvers which you can find here: https://relay.dev/docs/next/guides/relay-resolvers/introduction/

By using Relay Live Resolvers (an experimental feature) you can invalidate values at field-level or record-level granularity. For a crude start I just invalidated every value on every db update. But something like https://github.com/vlcn-io/cr-sqlite could probably get you something much more sophisticated.
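
The crude "invalidate everything on every db update" approach might look roughly like the sketch below. `LiveStateLike` is a local type modeled on the read/subscribe object that Relay's Live Resolvers return; `TinyDb` and `liveRow` are hypothetical stand-ins, not the prototype's actual code.

```typescript
// Local type modeled on the read/subscribe object returned by Relay's
// Live Resolvers; defined here so the sketch has no dependencies.
interface LiveStateLike<T> {
  read(): T;
  subscribe(onChange: () => void): () => void;
}

// Hypothetical stand-in for a local database with one global change event.
class TinyDb {
  private rows = new Map<string, string>();
  private listeners = new Set<() => void>();

  get(key: string): string | undefined {
    return this.rows.get(key);
  }

  set(key: string, value: string): void {
    this.rows.set(key, value);
    this.listeners.forEach((notify) => notify()); // fires for every write
  }

  onAnyChange(listener: () => void): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }
}

// Crude reactivity: every live value is invalidated by every write,
// including writes to unrelated rows.
function liveRow(db: TinyDb, key: string): LiveStateLike<string | undefined> {
  return {
    read: () => db.get(key),
    subscribe: (onChange) => db.onAnyChange(onChange),
  };
}

const db = new TinyDb();
db.set("user:1:name", "Ada");

const liveName = liveRow(db, "user:1:name");
let invalidations = 0;
const unsubscribe = liveName.subscribe(() => {
  invalidations += 1;
});

db.set("user:2:name", "Grace"); // unrelated write still invalidates
db.set("user:1:name", "Ada Lovelace");
unsubscribe();
db.set("user:1:name", "Ada King"); // no longer subscribed
```

A more sophisticated data source would let `subscribe` register for changes to one row or table only, so unrelated writes would not trigger re-reads.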

@flow-danny

With a generic entity schema it will be easy to know which Node IDs are invalid, but is there currently a way to invalidate only the active queries currently rendering those nodes?

@flow-danny

flow-danny commented May 16, 2024

I also looked at Relay resolvers, but they work so differently to normal resolvers... tied to fragments instead of the schema itself.

Skipping all the networking and JSON back and forth, a regular graphql-tools resolver will already give you subscriptions, which is basically reactive.

I'm sure it's possible to stitch a server schema in there and have the client-side resolver do a regular network fetch.

The resolver could also be compiled using something like graphql-jit to reduce overhead.

@captbaritone
Contributor Author

I also looked at Relay resolvers, but they work so differently to normal resolvers... tied to fragments instead of the schema itself.

Sorry for the confusion. We've been expanding Relay Resolvers to enable them to model arbitrary client state with field-level reactivity. I've just merged a PR which adds documentation for this experimental feature. You can read more here: https://relay.dev/docs/next/guides/relay-resolvers/introduction/
