
DataSources: InMemory LRU Cache has Infinite maxSize #2252

Closed
nglgzz opened this issue Jan 31, 2019 · 9 comments

nglgzz commented Jan 31, 2019

From the DataSources docs:

By default, resource caching will use an in-memory LRU cache.

This is misleading, considering that the default cache has a maxSize of Infinity. You can have any eviction policy you like, but if the cache size is infinite, you'll never evict any keys.

My suggestion is to either:

a) Update the docs to say that by default the cache has no memory limit.
b) Set a reasonable maxSize and update the function that calculates the size of items, since it currently doesn't really make sense (length of the item if it's a string or array, length of the stringified object, or 1 in other cases).

I think the second option is the cleaner solution, and if you're okay with it, I'll make a PR for it.

abernix (Member) commented Jan 31, 2019

You're right that it's currently not evicting anything.

In terms of the second half of your b) option, some of that's already been done via changes in the upcoming 2.4.0 release (#2215; specifically https://github.com/apollographql/apollo-server/pull/2215/files#diff-3c78ac01fa2bb0ac00758ed1436212a6, and even more specifically via be71620). That will land pretty soon (I think Monday).

I'm curious what your thoughts are on a reasonable maxSize for a RESTDataSource. It seems very likely that it should be configured per data source, particularly since the shape of their responses can vary so much.

nglgzz (Author) commented Jan 31, 2019

I am not sure I understand the default size calculation.

function defaultLengthCalculation(item: any) {
  if (Array.isArray(item) || typeof item === 'string') {
    return item.length;
  }

  // Go with the lru-cache default "naive" size, in lieu of anything better:
  //   https://github.com/isaacs/node-lru-cache/blob/a71be6cd/index.js#L17
  return 1;
}

According to this function, 'foo' and ['foo', 'bar', 'baz'] occupy the same size in cache (both report a length of 3). Is this expected behavior? Wouldn't the following be more accurate?

function defaultLengthCalculation(item: any) {
  // Guard against null, since typeof null is also 'object'.
  if (item !== null && typeof item === 'object') {
    return Buffer.byteLength(JSON.stringify(item));
  }

  if (typeof item === 'string') {
    return Buffer.byteLength(item);
  }

  return 1;
}

For maxSize I was thinking of something not too big (some machines don't have much RAM) but still able to hold a decent amount of data. For instance, the default memory limit for memcached is 64MB; I would use something around that value.

I don't think a max size should be set for each DataSource. Isn't it just one cache shared between different DataSources?
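To make the difference concrete, here is a standalone sketch of the byte-based calculation proposed above (the function name is illustrative, not Apollo's API; it uses Node's Buffer):

```typescript
// Standalone sketch of a byte-based size calculation.
// The function name is illustrative, not part of Apollo's API.
function byteSizeCalculator(item: unknown): number {
  if (typeof item === 'string') {
    return Buffer.byteLength(item, 'utf8');
  }
  // Guard against null, since typeof null is also 'object'.
  if (item !== null && typeof item === 'object') {
    return Buffer.byteLength(JSON.stringify(item), 'utf8');
  }
  // Fall back to the naive unit size for numbers, booleans, etc.
  return 1;
}

// The default length calculation scores both of these as 3;
// a byte-based one tells them apart:
console.log(byteSizeCalculator('foo'));                 // 3
console.log(byteSizeCalculator(['foo', 'bar', 'baz'])); // 19
```

Paired with a maxSize expressed in bytes (for example 64 * 1024 * 1024, roughly the memcached default mentioned above), eviction would then track actual payload size rather than item count.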

abernix (Member) commented Jan 31, 2019

According to this function foo and ['foo', 'bar', 'baz'] occupy the same size in cache. Is this expected behavior?

I think this is the expected behavior for a generic LRU that is meant to be used in many places, not just RESTDataSource. The basic implementation should not take any liberties, including trying to approximate object size; it should keep using a naive default and rely on individual implementors to provide more exact rules. This is exactly how the lru-cache it's based on behaves. And lastly, the .length calculation makes the naive assumption that you're storing like-typed objects (not, say, a mixture of Arrays and Strings) in the store, which we are.

To be clear: The InMemoryLRUCache did not previously allow this configuration but that's precisely what the changes in be71620 bring. We've utilized the newly available sizeCalculator in this location.

I don't think a max size should be set for each DataSource. Isn't it just one cache shared between different DataSources?

There are multiple caches, one for each data source; see:

initialize(config: DataSourceConfig<TContext>): void {
  this.context = config.context;
  this.httpCache = new HTTPCache(config.cache);
}

I think that's fair, since each underlying endpoint may have varying dynamics that make it more or less important for its responses to be in the cache. For example, a local REST endpoint doesn't necessarily benefit as much from the LRU cache as a remote REST endpoint with poor performance dynamics.

This is all to say, I think we should consider passing maxSize with a sensible default to InMemoryLRUCache via its construction in HTTPCache. (And actually, the default length behavior happens to be fine in this case, since we're storing strings with String.prototype.length properties.) I suspect this should be opened up as a DataSourceConfig option, but the naming could be confusing with the existing cache option on there.

I'm curious what @martijnwalraven thinks.

@ThiruvenkatamR

Hi,

I would like to add one more thing related to this. I am using apollo-datasource-rest to fetch data from a REST endpoint. I faced the same issue of requests being cached indefinitely, and I'd also point out that even if the headers change, we still get the same response.

For example, suppose we fetch a list of data from an endpoint and it gets cached, and that request had the authorization header set to ABCD. Now, when we request the same endpoint with authorization header WXYZ, we still get the same response. The cache doesn't detect the change at all; it only checks whether the endpoint is the same. We use authorization headers for access-related details, but we're not able to enforce them here: whatever access the first caller had is cached and served to everyone else.

Please let me know how to handle this.

Thanks

@nharlow89

@ThiruvenkatamR It looks like the caching mechanism for RESTDataSource is pretty basic and does not take different authorization headers into account.

// By default, we use the full request URL as the cache key.
// You can override this to remove query parameters or compute a cache key
// in any way that makes sense. For example, you could use this to take
// Vary header fields into account. Although we do validate header fields
// and don't serve responses from cache when they don't match, new
// responses overwrite old ones with different vary header fields.
protected cacheKeyFor(request: Request): string {
  return request.url;
}

I am not convinced that the following statement is accurate

we do validate header fields and don't serve responses from cache when they don't match

You may be able to solve your problem by implementing your own cacheKeyFor method and appending your authorization header somewhere in there, for example:

protected cacheKeyFor(request: Request): string {
  if (request.headers.get('authorization')) {
    return `${request.headers.get('authorization')}|${request.url}`;
  }
  return request.url;
}

@ThiruvenkatamR

Thanks @nharlow89, will check it out. Much appreciated.

borekb (Contributor) commented Feb 25, 2020

Just to make sure: to create an LRU cache for 100 items, I have to do this, right?

new ApolloServer({
  cache: new InMemoryLRUCache({
    maxSize: 100,
    sizeCalculator: () => 1,
  }),
});

It's a bit unintuitive that I have to provide the size calculator, but without it the default length calculation counts string length, so a single 100-character string could fill the whole cache, right?
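A toy model of the semantics (this is not Apollo's implementation; it only illustrates how maxSize bounds the sum of sizeCalculator results, so () => 1 turns it into an item-count limit):

```typescript
// Toy LRU to illustrate the semantics (not Apollo's implementation).
// `maxSize` bounds the sum of sizeCalculator(value) over all entries,
// so `sizeCalculator: () => 1` makes it an item-count limit.
class TinyLRU<V> {
  private entries = new Map<string, V>();
  private sizes = new Map<string, number>();
  private total = 0;

  constructor(
    private maxSize: number,
    private sizeCalculator: (value: V) => number,
  ) {}

  set(key: string, value: V): void {
    if (this.entries.has(key)) this.delete(key);
    const size = this.sizeCalculator(value);
    this.entries.set(key, value);
    this.sizes.set(key, size);
    this.total += size;
    // Evict least-recently-used entries (Map preserves insertion order).
    while (this.total > this.maxSize) {
      const oldest = this.entries.keys().next().value as string;
      this.delete(oldest);
    }
  }

  get(key: string): V | undefined {
    const value = this.entries.get(key);
    if (value !== undefined) {
      // Refresh recency by re-inserting at the back.
      this.entries.delete(key);
      this.entries.set(key, value);
    }
    return value;
  }

  delete(key: string): void {
    this.total -= this.sizes.get(key) ?? 0;
    this.entries.delete(key);
    this.sizes.delete(key);
  }

  get size(): number {
    return this.entries.size;
  }
}

const cache = new TinyLRU<string>(2, () => 1); // maxSize 2 == "at most 2 items"
cache.set('a', 'x'.repeat(10_000)); // a huge string still counts as 1
cache.set('b', 'y');
cache.set('c', 'z'); // evicts 'a', the least recently used
console.log(cache.get('a')); // undefined
console.log(cache.size);     // 2
```

With the default length calculation instead of () => 1, that 10,000-character string alone would exceed a maxSize of 100, which is exactly the concern raised above.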

@patrickwall57

Would a default of 0MB, coupled with an explanation in the docs on how to enable the cache, be more reasonable than an unbounded cache? cries in out-of-memory exception

glasser (Member) commented Oct 11, 2022

The class in question has moved from apollo-server-caching to @apollo/utils.keyvaluecache, and as part of that migration, the new version of the class is bounded by default.

@glasser glasser closed this as completed Oct 11, 2022
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 20, 2023