
Cache successfully parsed and validated documents for future requests. #2111

Merged

abernix merged 15 commits into release-2.4.0 from abernix/cache-parsed-validated on Jan 23, 2019

Conversation

abernix (Member) commented Dec 18, 2018

This PR builds upon the housekeeping work done in #2107, #2108 and #2109 and implements an in-memory cache store for parsed and validated documents in an effort to increase server performance.

Upon receiving a request, one of the many steps Apollo Server must perform is the parsing and validation of the document via graphql-js's `parse` and `validate` functions. Parsing and validating the document results in a DocumentNode (an AST), which is used for the eventual execution (via graphql-js's `execute`).

For relatively small documents this takes a small fraction of time, but with more complex documents the computational cost becomes larger and less predictable.

When processing the same operation over and over again, we can avoid that expensive computation by storing the resulting DocumentNode, rather than rebuilding what is ultimately a re-forged copy of an effectively identical DocumentNode.

This PR introduces that technique, utilizing the SHA-256 hex digest of the operation document itself as a key and storing the corresponding DocumentNode in an InMemoryLRUCache (powered under the hood by the excellent lru-cache package!).
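
A minimal sketch of the technique, assuming graphql-js's `parse` and `validate`, with a plain `Map` as a stand-in for `InMemoryLRUCache` (so not Apollo Server's exact internals):

```ts
import { createHash } from 'crypto';
import { parse, validate, DocumentNode, GraphQLSchema } from 'graphql';

// Stand-in for InMemoryLRUCache: keyed by the SHA-256 hex digest of the
// operation text, valued by the parsed-and-validated DocumentNode.
const documentStore = new Map<string, DocumentNode>();

function parseAndValidate(schema: GraphQLSchema, query: string): DocumentNode {
  const key = createHash('sha256').update(query).digest('hex');

  const cached = documentStore.get(key);
  if (cached) return cached; // cache hit: skip parsing and validation entirely

  const document = parse(query);
  const errors = validate(schema, document);
  if (errors.length > 0) throw errors[0];

  documentStore.set(key, document); // save for future requests
  return document;
}
```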

When using GraphQL variables (rather than string/numeric literals within the GraphQL document itself), this means that repeat operations — even with dynamically changing variables — can still be ready to execute based on the parsing and validation of an identical, previous invocation, cutting a substantial amount of time off the request round-trip.
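
Illustratively (the `user` field here is hypothetical, not from this PR): because the cache key is a hash of the document text, inlined literals produce a new entry per value, while variables keep the text byte-identical across requests:

```ts
import { createHash } from 'crypto';

const sha256 = (s: string) => createHash('sha256').update(s).digest('hex');

// Inlined literals: each distinct value is a distinct document text,
// and therefore a distinct cache key (a miss every time).
sha256('{ user(id: "1") { name } }'); // key A
sha256('{ user(id: "2") { name } }'); // key B

// With variables, the document text never changes; only the variables
// payload does, so every request reuses the same cached DocumentNode.
sha256('query User($id: ID!) { user(id: $id) { name } }'); // same key each time
```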

Some benchmarks might help demonstrate this better. For the purposes of demonstration, I'll utilize relatively flat GraphQL documents (and corresponding schemas), which I benchmarked locally using Apache JMeter. The resolvers involved in this test are all so-called "default resolvers" (they do nothing but return null), which should isolate the "win" from this optimization technique:
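
For reference, a hedged sketch of the kind of flat schema described, assuming `graphql-tools`' `makeExecutableSchema` (the actual benchmark harness isn't included in this PR):

```ts
import { makeExecutableSchema } from 'graphql-tools';

// N scalar fields with no resolvers supplied, so graphql-js falls back to
// "default resolvers", which resolve to null here.
const N = 100;
const fields = Array.from({ length: N }, (_, i) => `Field_${i + 1}: String`);
const typeDefs = `type Query { ${fields.join(' ')} }`;

export const schema = makeExecutableSchema({ typeDefs });
```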

Benchmarks

p99 = 99th percentile.

| Query | Without DocumentNode caching | With DocumentNode caching |
| --- | --- | --- |
| `query Benchmark { Field_1 }` | Avg.: 19.96ms<br>p99: 56.00ms<br>Throughput: 925.45 | Avg.: 18.85ms ⚡️<br>p99: 58.00ms<br>Throughput: 979.08 ⚡️ |
| `query Benchmark { Field_1 Field_2 … Field_100 }` | Avg.: 34.89ms<br>p99: 74.00ms<br>Throughput: 530.99 | Avg.: 20.18ms ⚡️<br>p99: 55.99ms ⚡️<br>Throughput: 917.70 ⚡️ |
| `query Benchmark { Field_1 Field_2 … Field_5000 }` | Avg.: 954.87ms<br>p99: 1007.00ms<br>Throughput: 19.22 | Avg.: 310.07ms ⚡️<br>p99: 390.00ms ⚡️<br>Throughput: 59.64 ⚡️ |

The single-field operation is a wash, but there are substantial improvements as the query document's complexity increases! Looking at the middle of the road (and the more reasonable request) of 100 fields, the 99th percentile (i.e. 99% of requests returned within that timeframe) is roughly 1.3× faster (74.00ms vs. 55.99ms)!

It's worth pointing out that I had a 1000ms request/response timeout configured on the test runner, so the results for the "5000 fields, without caching" test don't show the true depth of the situation, since 88.10% of those operations read timed out! (Of course, requesting 5000 top-level fields raises a number of concerns of its own, so if that metric piques your interest I certainly encourage you to dive in more!) On the plus side, with the parsed and validated document cached there were no timeouts at all and, based on the average response time, it performed roughly 3× faster (954.87ms vs. 310.07ms). 🏎💨🎠

For anyone interested in more details, I've posted the more complete results at https://abernix.github.io/documentStoreBenchmarks, in the form of JMeter reports. You can click through to them from the index page.

Of course, there are still a couple of things to be figured out here, particularly in the realm of configuration surface area. Currently, this utilizes a 30MiB cache, which might be enough for many/most configurations, but I'd like to figure out exactly what the sensible defaults might be here — and what might need tuning — before opening it up for configuration. In fact, if we don't need to add configuration, that would be even better!

I'm excited to see how this performs for others!

Fixes #1566

 - apollo-cache-control@0.5.0-alpha.0
 - apollo-datasource-rest@0.3.0-alpha.0
 - apollo-datasource@0.3.0-alpha.0
 - apollo-engine-reporting@0.3.0-alpha.0
 - apollo-server-azure-functions@2.4.0-alpha.0
 - apollo-server-cache-memcached@0.3.0-alpha.0
 - apollo-server-cache-redis@0.3.0-alpha.0
 - apollo-server-caching@0.3.0-alpha.0
 - apollo-server-cloud-functions@2.4.0-alpha.0
 - apollo-server-cloudflare@2.4.0-alpha.0
 - apollo-server-core@2.4.0-alpha.0
 - apollo-server-express@2.4.0-alpha.0
 - apollo-server-hapi@2.4.0-alpha.0
 - apollo-server-integration-testsuite@2.4.0-alpha.0
 - apollo-server-koa@2.4.0-alpha.0
 - apollo-server-lambda@2.4.0-alpha.0
 - apollo-server-micro@2.4.0-alpha.0
 - apollo-server-plugin-base@0.3.0-alpha.0
 - apollo-server-testing@2.4.0-alpha.0
 - apollo-server@2.4.0-alpha.0
 - apollo-tracing@0.5.0-alpha.0
 - graphql-extensions@0.5.0-alpha.0

abernix commented Dec 18, 2018

For anyone interested in trying this out and providing feedback, this has been published under the apollo-server "next" tag as:

apollo-server@2.4.0-alpha.0

Integrations (e.g. express, hapi, koa) will find their respective versions published accordingly (apollo-server-express@2.4.0-alpha.0, apollo-server-hapi@2.4.0-alpha.0, apollo-server-koa@2.4.0-alpha.0; see 7697623 above for the full list).

```ts
// If we still don't have a document, we'll need to parse and validate it.
// With success, we'll attempt to save it into the store for future use.
if (!document) {
  const parsingDidEnd = await dispatcher.invokeDidStartHook(
```
A reviewer (Member) asked:
Could there be any downsides for not calling parsingDidStart/DidEnd in this code path (and not providing some alternate callback to take its place)?

abernix (Member Author) replied:

If someone has implemented critical parsingDidStart or parsingDidEnd functionality that didn't isolate its reach to the parsing phase, they might find these callbacks are no longer invoked where they used to be.

Since we never documented or officially supported the graphql-extensions API, and the request pipeline plugin hooks are new (though still undocumented; see #2008), I think the best we can do is stay true to the names of the methods and stand by the fact that parsing did not start or end; I don't think we should call parsingDidStart and parsingDidEnd if those behaviors did not actually happen.

Presumably, those who wanted behavior which occurred when execution started or ended would be using the executionDidStart or executionDidEnd hooks (and more generally, requestDidStart, requestDidEnd, willSendResponse, etc.).

I don't think there are any alternate callbacks that we can provide that will make this any easier, and I don't think such hooks would provide intrinsic value.
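
To make that concrete, here's a hedged sketch of a request-pipeline plugin using these hooks (the hook shape matches my reading of the then-undocumented API): on a document-store hit, parsing genuinely never happens, so neither function fires for that request.

```ts
// Hypothetical timing plugin, for illustration only.
const parseTimingPlugin = {
  requestDidStart() {
    return {
      parsingDidStart() {
        const startedAt = Date.now();
        // The returned function acts as the "did end" hook; it is only
        // reached when parsing actually ran (i.e. a documentStore miss).
        return () => {
          console.log(`parsing took ${Date.now() - startedAt}ms`);
        };
      },
    };
  },
};
```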


mxstbr commented Jan 10, 2019

We have a bunch of really big queries for https://github.com/withspectrum/spectrum, so I'm excited about this change as it should provide a solid speedup for our use case!

Will see if we can get 2.4.0-alpha.0 into production and report back with some solid metrics 💯 EDIT: PRs open: withspectrum/spectrum#4533

mxstbr pushed a commit to withspectrum/spectrum that referenced this pull request Jan 10, 2019
This implements query parsing and validation caching, reference
apollographql/apollo-server#2111, and should
thusly provide a nice speedup for our frontend's queries!

mxstbr commented Jan 14, 2019

Cross-posting from withspectrum/spectrum#4533: Reassessing after a couple days in prod, it seems to have helped @abernix!

p99 response time over:

  • Three days before deploying this: 250.78ms
  • Three days after deploying this: 230.65ms

👍

… use.

Without this change, the `document` property was not set on the
`requestContext` for consumption by request pipeline plugins.

To further guard against this oversight, I've removed the extra `document`
variable, which was being used as scoped state for the document, and switched
to directly using (and assigning to) `requestContext.document`.

Nice catch, @glasser!

Ref: https://github.com/apollographql/apollo-server/pull/2111/files#r247617469
The parsed/validated document store is on by default.  While in theory it
could be disabled, currently it cannot be, since it's an internal property of
the request pipeline processor class.

See confusion here:

  withspectrum/spectrum#4533 (comment)
…ore`.

While the implementation of the `documentStore` is currently simple enough
to never throw (it is in-memory), it makes some sense to guard against future
extended functionality where an exception might be raised.

Since storing this object in a distributed memory store isn't currently
feasible, I'm not sure what such an exception would be right now, but I
don't mind being proactive!

Ref: https://github.com/apollographql/apollo-server/pull/2111/files#r247618501
Previously, this used the `JSON.stringify` length, but this approach is
slightly more aware of the object's actual size.
…class.

The implementation of object-size approximation which is used for cache
eviction purposes in the `InMemoryLRUCache` implementation (via `lru-cache`)
was a short-term location for extensible logic which is better located
within `ApolloServerBase`.

This is particularly important since future logic may necessitate knowing or
understanding the current size (roughly, memory usage) of the in-memory
storage.  Effective immediately, this adds support for providing a `dispose`
function which is called when an object is purged from the cache to make
room for another.
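
A hedged sketch of how this can look with `lru-cache` v5's options (the exact size heuristic and wiring in `ApolloServerBase` may differ):

```ts
import LRUCache from 'lru-cache';
import { DocumentNode } from 'graphql';

// Approximate an object's in-memory size by the UTF-8 byte length of its
// JSON serialization. This is only a proxy: per-object JS engine overhead
// is not captured, which is one reason real memory usage can exceed the
// nominal budget.
const approximateObjectSize = (obj: object): number =>
  Buffer.byteLength(JSON.stringify(obj), 'utf8');

const documentStore = new LRUCache<string, DocumentNode>({
  max: 30 * 1024 * 1024, // ~30MiB nominal budget, measured via `length` below
  length: (value) => approximateObjectSize(value),
  dispose: (key, value) => {
    // Called when an entry is purged from the cache to make room for another.
  },
});
```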
abernix force-pushed the abernix/cache-parsed-validated branch from 2a987a9 to be71620 on January 16, 2019 at 10:47
In an effort to see how effective this cache is in production during this
alpha phase, we'll print out the stats on the document store every 60
seconds.
 - apollo-cache-control@0.5.0-alpha.1
 - apollo-datasource-rest@0.3.0-alpha.1
 - apollo-datasource@0.3.0-alpha.1
 - apollo-engine-reporting@0.3.0-alpha.1
 - apollo-server-azure-functions@2.4.0-alpha.1
 - apollo-server-cache-memcached@0.3.0-alpha.1
 - apollo-server-cache-redis@0.3.0-alpha.1
 - apollo-server-caching@0.3.0-alpha.1
 - apollo-server-cloud-functions@2.4.0-alpha.1
 - apollo-server-cloudflare@2.4.0-alpha.1
 - apollo-server-core@2.4.0-alpha.1
 - apollo-server-express@2.4.0-alpha.1
 - apollo-server-hapi@2.4.0-alpha.1
 - apollo-server-integration-testsuite@2.4.0-alpha.1
 - apollo-server-koa@2.4.0-alpha.1
 - apollo-server-lambda@2.4.0-alpha.1
 - apollo-server-micro@2.4.0-alpha.1
 - apollo-server-plugin-base@0.3.0-alpha.1
 - apollo-server-testing@2.4.0-alpha.1
 - apollo-server@2.4.0-alpha.1
 - apollo-tracing@0.5.0-alpha.1
 - graphql-extensions@0.5.0-alpha.1

abernix commented Jan 16, 2019

That's great news, @mxstbr!

I'm pretty close to merging this and iterating on it in subsequent follow-up work, but I think it's worth a last peek before doing so.

In particular, while those who have implemented this so far have had consistently improved response times to show for it, I've added an additional commit to this PR and cut another -alpha.1 release to gain some additional validation through data, which will hopefully show that this cache store's default size is sufficient and that cache evictions are low.

Therefore, the versions indicated in the Publish commit above (33d1f82; e.g. apollo-server-express@2.4.0-alpha.1) will console.debug some statistics every 60 seconds showing the document store usage, e.g.:

[documentStore] 331408 / 31457280 (1%). Disposed of 0 documents this interval.

My intention is to remove this spamminess before merging, but I thought interested parties might want to take a look. So far, in our own dog-fooding on Apollo Engine, this seems to be working well as currently implemented.
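
For the curious, a rough sketch of what such a logger can look like against `lru-cache` (an assumption on my part, not necessarily the alpha.1 code): the cache exposes `length` (the sum of calculated entry sizes) and `max` (the configured cap).

```ts
import LRUCache from 'lru-cache';
import { DocumentNode } from 'graphql';

let disposedThisInterval = 0;

const documentStore = new LRUCache<string, DocumentNode>({
  max: 30 * 1024 * 1024,
  dispose: () => {
    disposedThisInterval += 1; // count evictions between log lines
  },
});

setInterval(() => {
  const pct = Math.floor((documentStore.length / documentStore.max) * 100);
  console.debug(
    `[documentStore] ${documentStore.length} / ${documentStore.max} (${pct}%). ` +
      `Disposed of ${disposedThisInterval} documents this interval.`,
  );
  disposedThisInterval = 0;
}, 60 * 1000);
```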

… seconds."

This reverts commit 7a0d0e6, as I intended
when I originally introduced it.

Ref: #2111 (comment)
 - apollo-server-azure-functions@2.4.0-alpha.2
 - apollo-server-cloud-functions@2.4.0-alpha.2
 - apollo-server-cloudflare@2.4.0-alpha.2
 - apollo-server-core@2.4.0-alpha.2
 - apollo-server-express@2.4.0-alpha.2
 - apollo-server-hapi@2.4.0-alpha.2
 - apollo-server-integration-testsuite@2.4.0-alpha.2
 - apollo-server-koa@2.4.0-alpha.2
 - apollo-server-lambda@2.4.0-alpha.2
 - apollo-server-micro@2.4.0-alpha.2
 - apollo-server-plugin-base@0.3.0-alpha.2
 - apollo-server-testing@2.4.0-alpha.2
 - apollo-server@2.4.0-alpha.2
 - graphql-extensions@0.5.0-alpha.2

abernix commented Jan 23, 2019

As I mentioned I would in #2111 (comment), I've reverted 7a0d0e6 and published 2.4.0-alpha.2 without that commit.

I'm going to merge this PR into a release-2.4.0 branch, which will be a staging ground for anything else 2.4.0-related (#1971, I'm looking at you!).

@abernix abernix changed the base branch from master to release-2.4.0 January 23, 2019 13:09
@abernix abernix merged commit 7d54688 into release-2.4.0 Jan 23, 2019
@abernix abernix deleted the abernix/cache-parsed-validated branch January 23, 2019 13:10
@abernix abernix mentioned this pull request Jan 23, 2019
abernix added a commit that referenced this pull request Jan 25, 2019
Since we already write to the persisted query cache asynchronously (and
intentionally do not await the Promise), this won't have any effect on the
current behavior of when the persisted query cache is written to for an
executable request.  But if an operation comes in and doesn't parse or
validate, we'll avoid wasting cache space on an operation that will never
execute.

Additionally, in a similar light, if a plugin throws an error which stops
execution, we can avoid the side-effect of writing to the persisted query
cache.

In terms of the APQ behavior, while this could cause additional round-trips
for a client which repeatedly sends an invalid operation, that operation is
never going to successfully finish anyway.  While developer tooling will
help avoid this problem in the first place, this refusal to store an invalid
operation in the APQ cache could actually help diagnose a failure since the
full operation (rather than just the SHA256 of that document) will appear
in the browser's dev-tools on the retry.

If we're looking to spare parsing and validating documents which we know are
going to fail, I think that's going to be a better use of the (new)
`documentStore` cache (#2111), since it's in-memory and can accommodate a
more complex data structure which includes the validation errors from the
previous attempt, which will, of course, be the same when validating the same
operation against the same schema again.  As the PR that introduced that
feature shows, sparing those additional parses and validations (cached APQ
documents still need to be parsed and validated, currently) will provide a
bigger performance win overall.

Ref: #2111
akpx added a commit to akpx/spectrum that referenced this pull request Feb 4, 2019
* Upgrade to apollo-server-express@2.4.0-alpha.0

  This implements query parsing and validation caching, reference
  apollographql/apollo-server#2111, and should
  thusly provide a nice speedup for our frontend's queries!
abernix added a commit that referenced this pull request Feb 7, 2019

abernix commented Feb 8, 2019

This has been released in Apollo Server 2.4.0!

e.g.

  • apollo-server-azure-functions@2.4.0
  • apollo-server-cloud-functions@2.4.0
  • apollo-server-cloudflare@2.4.0
  • apollo-server-express@2.4.0
  • apollo-server-hapi@2.4.0
  • apollo-server-koa@2.4.0
  • apollo-server-lambda@2.4.0
  • apollo-server-micro@2.4.0
  • apollo-server@2.4.0

stephenh commented

Just as an FYI to others that find this: we upgraded to apollo-server-express@2.4.0 and had to roll back due to OOMs. We usually run pretty lean on RAM (600mb/container, using ~50% of it); with 2.4.0 (presumably b/c of this change) our RAM usage spiked to ~100%. Which is fine, we'll probably allocate more RAM to use this feature. Disclaimer: we haven't taken heap dumps yet, so it could have been something else, but the apollo-server upgrade from ~2.2 to 2.4 was the only change in the release.


glasser commented Feb 20, 2019

Huh, the intention is for this cache to take about 30MB. If you're seeing more, that sounds like a bug!

It's also evidence that it's time to add the configurability @abernix alluded to, but it's more important to make sure there isn't just a bug in the size calculation or something...

stephenh commented

@glasser 30mb does sound much more reasonable. :-) I'll work on confirming the root cause but it probably won't be super quick just given current priorities.


abernix commented Feb 20, 2019

@stephenh We've been running this in production ourselves for a while now and I haven't witnessed that problem, but I certainly wouldn't rule out the possibility!

If you're able, I'm curious if you could try using 2.4.0-alpha.2, which as I alluded to in the comment above (#2111 (comment)) has hard-coded console.log-ging every 60 seconds which should expose the cache usage. The stability of that -alpha should be fine, but I'm also happy to cut you another release off the current master with that logging enabled if it's any use.

My hope in offering this suggestion is that it might be something you could quickly opt into, suffer the temporary consequences of the OOM, and feed that information back to help identify whether this is indeed at fault.

stephenh commented

@abernix sure, I can try 2.4.0-alpha.2; that sounds like a good idea. I'll have to jury-rig a ~dark/canary-type setup... likely manually register a machine into the ELB for a bit or something like that, so it will take a bit to work on it in between other things.


shackpank commented Mar 5, 2019

I'm experiencing the same issue here; it seems you may or may not encounter it based on client behaviour. In our case, there's a mutation like this, where the input is different for every user and the client constructs the document with the input inline:

```graphql
mutation {
  storeSomeId(id: "sajbfjkbdkfjbfk") { id }
}
```

Repeatedly running this mutation locally with a randomised input quickly fills the cache up. The debugging does indicate that older documents are dropped when the limit is reached, but the memory usage difference between an empty and a full cache was more like 138 MB than 30 MB, and memory usage continues to rise even after the cache is "full" (though more slowly, roughly 1 MB per 2k requests; edit: comparing heap dumps suggests this is something internal to the lru-cache module and would likely need investigation over there).

Assuming you control the client, the issue can be avoided (and much better use made of this cache) by rewriting the query like this to pass the input as a variable, which means the cache does not grow at all with additional requests:

```graphql
mutation storeSomeId($id: String!) {
  storeSomeId(id: $id) { id }
}
```
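
From the client side, that just means sending the value in the `variables` payload instead of interpolating it into the query string (illustrative `fetch` call; the endpoint and mutation are from the example above):

```ts
// Every request sends the exact same document text, so the server's
// documentStore holds a single entry regardless of how many distinct
// ids are submitted.
await fetch('/graphql', {
  method: 'POST',
  headers: { 'content-type': 'application/json' },
  body: JSON.stringify({
    query: 'mutation storeSomeId($id: String!) { storeSomeId(id: $id) { id } }',
    variables: { id: 'sajbfjkbdkfjbfk' },
  }),
});
```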


abernix commented Mar 7, 2019

@shackpank Thank you for sharing these helpful details and taking the time to look into this, particularly the follow-up edit which questions lru-cache's potential involvement. Super appreciated!

It sounds like there are two things:

  • Cache limit exceedance: slower growth, past the cache limit. Potentially something we're doing incorrectly, but also potentially something in lru-cache.
  • Cache misrepresentation: the roughly estimated "30MB" cache limit (intentionally rough!), which mainly aims to facilitate eviction and prevent the cache from growing unbounded, may need to be re-analyzed. Measuring actual heap object size here would sure be nice, but JavaScript being JavaScript, we may just need another creative approach (see the sketch after this list). This isn't configurable right now, and this lack of certainty about what the limit actually means in terms of memory usage is a good reason why. The overarching goal is just to make sure that someone couldn't send so many unique documents to the server that they'd fill up the cache.
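
One candidate for such a creative approach (my assumption only, not what the PR does) is Node's `v8` serializer as a rough heap-size proxy:

```ts
import { serialize } from 'v8';

// Structured-clone serialization captures more of the real object graph
// than JSON.stringify, though it is still only an approximation of heap use.
const approxBytes = (obj: unknown): number => serialize(obj).byteLength;
```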

I'm curious: when you say that switching string literals to variables

> means the cache does not grow at all with additional requests

...does this indicate that the cache limit exceedance also stopped? Or was it just no longer showing up on the radar? I'd expect it to continue regardless, if enough unique DocumentNodes (with or without variables) were being sent.

lru-cache had a major bump to v5 in November via #2004. Maybe worth swapping that out for a v4 version? Is that something you could try? I somehow doubt this is something with that package — and more likely something we're doing — but worth digging around, I guess.

Since overall the cache is still bringing performance benefits even though it has exceeded its intended size allotment (read: in your case, 138MB), I'm actually mostly concerned about the slower growth, so hopefully we can get to the bottom of that. Still, we'll need to adjust these expectations and possibly figure out another way to estimate AST sizes.

However, I'm glad you've discovered that string literals pose particular problems for cache effectiveness and that using GraphQL variables offers benefits in this department. The use of variables offers other benefits as well (e.g. aggregation purposes for metric reporting), so it's highly recommended!


sebnow commented Apr 15, 2019

I can confirm the cache causes a memory leak. I wrote a script to generate random distinct queries and request them with wrk2. Taking a heap snapshot shows that it is retaining well above 30MB.

[Screenshot: heap snapshot taken 2019-04-15 showing the document store retaining well above 30MB]

In a production environment I am noticing a ~30MB/hr increase in memory. After logging distinct queries, I have found that there are on average 4 unique queries (with literals) requested per second.

graphql-js: 14.2.1
apollo-server-core: 2.4.8
apollo-server-caching: 0.3.1
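
For anyone wanting to reproduce this, a hedged sketch of a payload generator in the spirit of the script described above (the mutation shape is borrowed from the earlier comment, not sebnow's actual script):

```ts
import { randomBytes } from 'crypto';

// Each body embeds a fresh random literal, so every request is a distinct
// document text and therefore a distinct documentStore entry.
export function randomQueryBody(): string {
  const id = randomBytes(8).toString('hex');
  return JSON.stringify({
    query: `mutation { storeSomeId(id: "${id}") { id } }`,
  });
}
```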
