
Cache should include the promise of matching active requests #1078

Closed
MrSwitch opened this issue Feb 19, 2020 · 18 comments

Comments

@MrSwitch

What problem are you trying to solve?

Performance improvement: multiple simultaneous requests with caching should wait for the first to resolve and then reuse its cached result.

Describe the feature

A contrived example, but let's assume got is called twice simultaneously with the same link, and a third time after both have resolved.

```js
const A = got(link, {..., cache: map});
const B = got(link, {..., cache: map});

const [a, b] = await Promise.all([A, B]);
console.log('isFromCacheA', a.isFromCache); // false
console.log('isFromCacheB', b.isFromCache); // false

// Call again and we get `true`
const c = await got(link, {..., cache: map});
console.log('isFromCacheC', c.isFromCache); // true
```

As the output shows, neither A nor B is served from the cache, whilst C is. What I'd like to see ultimately is a single request whose promise is stored in the cache map and then awaited to serve the cached response.

To work around this, one currently has to maintain a separate map of in-flight promises to await before making subsequent requests...

```js
const map = new Map();
const promiseMap = new Map();

async function gotCache(link) {
	// If a request for this link is already in flight, wait for it to
	// settle so its response has been written to the cache map...
	if (promiseMap.has(link)) {
		await promiseMap.get(link);
	}
	// ...then this request can be served from the cache.
	const prom = got(link, {cache: map});
	promiseMap.set(link, prom);
	return prom;
}
```
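Taking the workaround one step further, concurrent callers could share the in-flight promise directly rather than awaiting it and issuing a second request. A minimal sketch, assuming a hypothetical `fetchResource` function in place of got (not got's actual API):

```javascript
// Dedupe concurrent requests for the same link by sharing the
// in-flight promise. fetchResource is a placeholder for the real
// HTTP call and must return a promise.
const inFlight = new Map();

function dedupe(fetchResource) {
	return link => {
		// A request for this link is already running: share its promise.
		if (inFlight.has(link)) {
			return inFlight.get(link);
		}
		const prom = fetchResource(link).finally(() => {
			// Remove the entry once settled, so later calls go back
			// through the normal (cacheable) request path.
			inFlight.delete(link);
		});
		inFlight.set(link, prom);
		return prom;
	};
}
```

With this, the A/B example above would issue a single network request and both callers would resolve with the same response.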
@sithmel

sithmel commented Feb 19, 2020

I have some experience with this, as I have worked on a library that implements a similar feature. I built the dedupe function in async-deco to deal with it: https://github.com/sithmel/async-deco

For an in-memory cache it is straightforward. For a distributed cache (Redis) it is definitely more tricky: you need distributed locking (in async-deco I used Redlock), and this could add an overhead to a request that I haven't had the chance to measure yet.

@szmarczak
Collaborator

I'll close this issue since it will be possible to do this as part of the "should accept different caching logic" point. I have to say this is a good idea; it will save us some time (10ms+). I am even doing this in my http2-wrapper package.

@sithmel

sithmel commented Feb 19, 2020

Just adding some extra thoughts.
In the case of multiple instances with distributed locking, you will completely lose any advantage to the latency of acquiring/releasing the lock. So I think it is a non-starter.

The reason I mentioned dedupe (async-deco) is that in reliable-get (the HTTP GET library I mentioned above) we knew in advance whether we would cache a resource. In that case you can dedupe the entire request, which is a very effective optimisation.

In RFC-compliant caching such as Got's, you only know whether you should cache after getting a response, so you can't apply that optimisation.

If you use an in-memory cache there is no latency, so the optimisation is not effective.

If you use an external cache (Redis, for example) you probably have multiple Node processes, so the optimisation is only effective in the lucky event that requests for the same resource land on the same process.

So, in my opinion, it is not really worth it.

@szmarczak
Collaborator

Got will follow the RFC, but people will be able to change the behavior if they'd like to.

latency of acquiring/releasing the lock

This is not C++; Node.js is single-threaded. Node.js itself is blazing fast; it's the I/O that is slow.

@MrSwitch
Author

Thanks @sithmel and @szmarczak

Reading up on the topic and looking at your examples, I have learnt a new term: memoize. It seems simple enough to decorate the got function for my needs, where I don't have too many different requests. Of course, an improved approach would likely use a Keyv storage adapter, but I didn't really look into those too much, since the memoize decorator worked out best for me as I could use it higher up.

Thanks again

@rfgamaral

Reading up on the topic and looking at your examples, I have learnt a new term: memoize. It seems simple enough to decorate the got function for my needs, where I don't have too many different requests. Of course, an improved approach would likely use a Keyv storage adapter, but I didn't really look into those too much, since the memoize decorator worked out best for me as I could use it higher up.

If you could share a working example of that memoization, that would be great :)

@MrSwitch
Author

If you could share a working example of that memoization, that would be great :)

Here you go @rfgamaral https://github.com/5app/memoize

@rfgamaral

Thank you :)

@rfgamaral

rfgamaral commented Feb 29, 2020

Here you go @rfgamaral https://github.com/5app/memoize

@MrSwitch Just noticed that you could have used p-memoize or mem for this exact purpose. Any reason you chose not to? Just trying to understand my options here :)

@MrSwitch
Author

@rfgamaral I didn't know about those beforehand. They all do very much the same thing. In my app I needed to set the cache expiry differently for rejected and resolved promises, so I defined an option useCache(cachedItem){} to customise that per implementation. Sorry, it's not really well documented.

Those others you reference look far more established than mine.

@rfgamaral

rfgamaral commented Mar 1, 2020

@MrSwitch Thank you for that, and sorry to bombard you with so many questions, but "just one more" if you don't mind...

Are you using both memoization and Got's cache property? What exactly is your request/cache flow for both?

Or maybe you're just using memoization over Got's cache? It seems that Got will move in that direction with #875, right? If that's the case, what are the advantages of doing so?

@MrSwitch
Author

MrSwitch commented Mar 2, 2020

@rfgamaral That's the reason why I started this issue. Yes, I'm using both, but it's not ideal because memoizing effectively duplicates the cached response.

I would have liked to understand the cache adapters in Got to see if there was a way to use them. However, based on my limited knowledge of the subject, the cache adapter would be thwarted by:

  • Caching only occurs after the response, not for the pending request.
  • I'd have to mutate the cached response object to override the expiry headers, which sounds hacky.

The benefit of memoizing is that it's a good all-round tool which can be applied higher up the call stack.

I hope that helps
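A generic promise-friendly memoizing decorator of the kind described here might look like the following sketch. The `memoize` function and its `maxAge` option are illustrative only, not the @5app/memoize (or p-memoize/mem) API:

```javascript
// Memoize at a higher layer: cache the result of a whole function call
// (which may fan out into several HTTP requests) rather than deduping
// each request individually.
function memoize(fn, {maxAge = 1000} = {}) {
	const cache = new Map();
	return (...args) => {
		// Key on the serialized arguments; fine for simple JSON-able args.
		const key = JSON.stringify(args);
		const hit = cache.get(key);
		if (hit && Date.now() - hit.time < maxAge) {
			return hit.value; // may be a pending promise: callers share it
		}
		const value = fn(...args);
		cache.set(key, {value, time: Date.now()});
		return value;
	};
}
```

Because the cached value can itself be a pending promise, wrapping a got-based service function with this both dedupes concurrent calls and caches the settled result until maxAge expires.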

@MrSwitch
Author

MrSwitch commented Mar 2, 2020

#875 does sound like it would make this possible.

@rfgamaral

Personally, I'm trying to "replicate" RESTDataSource using Got with two features in particular:

  • Request deduplication to prevent double-invoking HTTP calls in rapid succession (memoization).
  • Resource caching of recent HTTP responses to avoid hitting internal APIs just to get the same data.

If I understand everything correctly, I believe I could rely on Got's built-in cache mechanism for resource caching and use mem for request deduplication (with a very short maxAge like 250ms or maybe a bit less).

If I implement both mechanisms correctly I can potentially get something like this:

  • Make a request with a memoized got;
  • If multiple requests were done in quick succession, return a memoized promise (if not expired);
  • If no memoized promise is available, check Got's internal cache for a response;
  • If a cached response is available, return that;
  • If no cached response is available, make a request to the origin endpoint and return the response (while adding the promise to the memoized cache and the response to Got's internal cache).

What do you think?
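For what it's worth, that two-layer flow can be wired up roughly as below. Everything here is an assumption for illustration: `cachedGet`, `callOrigin`, and the 250ms dedupe window stand in for the memoized got and its options, not Got's or mem's actual API:

```javascript
// Two layers: a short-lived promise memo for deduplication in front
// of a longer-lived response cache.
const promiseMemo = new Map();   // link -> {prom, time}
const responseCache = new Map(); // link -> response body
const DEDUPE_MAX_AGE = 250;      // ms, as suggested above

function cachedGet(link, callOrigin) {
	const memo = promiseMemo.get(link);
	if (memo && Date.now() - memo.time < DEDUPE_MAX_AGE) {
		return memo.prom; // deduplicated: share the recent in-flight promise
	}
	const prom = (async () => {
		if (responseCache.has(link)) {
			return responseCache.get(link); // served from the response cache
		}
		const response = await callOrigin(link); // hit the origin endpoint
		responseCache.set(link, response);
		return response;
	})();
	promiseMemo.set(link, {prom, time: Date.now()});
	return prom;
}
```

Rapid successive calls share one promise; after the memo expires, the response cache still answers without touching the origin.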

@MrSwitch
Author

MrSwitch commented Mar 2, 2020

That's pretty much exactly what I'm doing for the happy path.

I added some extra stuff to handle exceptions https://github.com/5app/memoize/blob/master/README.md#options-memoizehandler-options

@rfgamaral

What exactly do you mean with "handle exceptions"? What kind of exceptions?

@MrSwitch
Author

MrSwitch commented Mar 2, 2020

What exactly do you mean with "handle exceptions"? What kind of exceptions?

HTTP status codes: we wanted to cache 4xx and 5xx responses too, but for a few seconds rather than minutes.
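The idea of a shorter TTL for failures can be sketched as a memoizer that shortens an entry's lifetime when its promise rejects. The function name and TTL values are assumptions for illustration, not the @5app/memoize useCache API:

```javascript
// Memoize a promise-returning function with a shorter TTL for
// rejections (e.g. 4xx/5xx turned into errors) than for successes.
function memoizeWithErrorTtl(fn, {okTtl = 60000, errorTtl = 5000} = {}) {
	const cache = new Map(); // key -> {prom, expires}
	return key => {
		const hit = cache.get(key);
		if (hit && Date.now() < hit.expires) {
			return hit.prom;
		}
		const prom = fn(key);
		// Assume the success TTL up front; shorten it if the promise rejects.
		const entry = {prom, expires: Date.now() + okTtl};
		prom.catch(() => {
			entry.expires = Date.now() + errorTtl;
		});
		cache.set(key, entry);
		return prom;
	};
}
```

Failed responses are then retried after seconds, while successful ones stay cached for minutes.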

@rfgamaral

Alright, thanks for the clarification :)
