🔥 Feature: Add max size to cache #1892

dranikpg · 2022-05-03T20:37:22Z

This PR is a proposed solution to #1136 and allows limiting the cache size.

As stated in the issue, the cache should expire old entries to make room for new entries. To find old entries, it has to keep track of their expirations in a separate data structure.

Finding a solution

In general, the data structure has to support three main operations:

adding entries
removing arbitrary entries (in case of self-expiration)
removing the oldest entry (with lowest expiration point)

At first, it looks like a binary search tree would be the perfect fit. However, Go's standard library doesn't offer much for implementing one, and I didn't want to add new dependencies or piles of code. An easier and equally performant solution exists: let's use a binary heap and track node movements to support arbitrary removals. This is what indexedHeap is for.
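A minimal sketch of the idea, not the PR's actual indexedHeap: it builds on container/heap and stores each entry's current position in the entry itself, so that Swap keeps the positions up to date. The names entry, expHeap, add, remove and removeOldest are hypothetical and chosen only for illustration.

```go
package main

import "container/heap"

// entry is a hypothetical heap item: a cache key, its expiration point,
// and its current position inside the heap slice.
type entry struct {
	key string
	exp uint64
	pos int
}

// expHeap is a min-heap ordered by expiration. Swap records every node
// movement, which is what makes removal of arbitrary entries possible.
type expHeap []*entry

func (h expHeap) Len() int           { return len(h) }
func (h expHeap) Less(i, j int) bool { return h[i].exp < h[j].exp }
func (h expHeap) Swap(i, j int) {
	h[i], h[j] = h[j], h[i]
	h[i].pos, h[j].pos = i, j
}

func (h *expHeap) Push(x interface{}) {
	e := x.(*entry)
	e.pos = len(*h)
	*h = append(*h, e)
}

func (h *expHeap) Pop() interface{} {
	old := *h
	n := len(old)
	e := old[n-1]
	old[n-1] = nil // drop the reference so it can be collected
	*h = old[:n-1]
	e.pos = -1 // mark as no longer in the heap
	return e
}

// add inserts a new entry and returns a handle for later removal.
func add(h *expHeap, key string, exp uint64) *entry {
	e := &entry{key: key, exp: exp}
	heap.Push(h, e)
	return e
}

// remove deletes an arbitrary entry using its tracked position.
func remove(h *expHeap, e *entry) { heap.Remove(h, e.pos) }

// removeOldest deletes and returns the entry with the lowest expiration.
func removeOldest(h *expHeap) *entry { return heap.Pop(h).(*entry) }
```

All three operations run in O(log n), which matches what a balanced binary search tree would offer for this use case.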

Benchmarks

Max size does not affect read performance in any way

Benchmark_Cache_MaxSize is a benchmark for three cases:

No max size specified (= no performance penalty)
A very large max size (= only insertions)
A small max size (= constant removal & insertion)

My results are as follows:

Benchmark_Cache_MaxSize/Disabled-16              1626 ns/op             349 B/op          7 allocs/op
Benchmark_Cache_MaxSize/Disabled-16              1588 ns/op             392 B/op          7 allocs/op
Benchmark_Cache_MaxSize/Unlim-16                 1812 ns/op             665 B/op          8 allocs/op
Benchmark_Cache_MaxSize/Unlim-16                 1678 ns/op             629 B/op          8 allocs/op
Benchmark_Cache_MaxSize/LowBounded-16            1178 ns/op             256 B/op          8 allocs/op
Benchmark_Cache_MaxSize/LowBounded-16            1190 ns/op             256 B/op          8 allocs/op

We're up one allocation from the first to the second case and about 10% slower per insertion. The third case is the fastest because the cache always stays small.

This is a benchmark where the handler takes zero time to execute. 10% might sound like a lot, but the difference is less than one microsecond in my case (that's actually suspiciously small 🤔). I doubt any handler call that has to be cached would care about such a time period.

Measuring real memory consumption

Instead of counting entries, we could count their body size to measure real memory consumption.

Other possible solutions

We could ignore expiration altogether and remove just the oldest entry by time of insertion. This would be the easiest and most performant solution. I'm not sure whether this would be correct behaviour (what if we pre-cache a bunch of big pages on startup: should we drop them all?).
Implement some garbage collection goroutine. This sounds more complex and is full of bad worst cases.
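For comparison, the first alternative (evicting purely by insertion order) could look roughly like the sketch below. It is an illustration only: fifoKeys, maxEntries and add are hypothetical names that do not appear in the PR, and the limit is counted in entries.

```go
package main

// fifoKeys is a hypothetical tracker that ignores expiration and evicts
// keys strictly in the order they were inserted.
type fifoKeys struct {
	maxEntries int      // assumed limit; zero means unlimited
	queue      []string // keys, oldest first
}

// add records a key. If the tracker is full, it drops the oldest key
// first and reports it so the caller can delete it from storage.
func (f *fifoKeys) add(key string) (evicted string, ok bool) {
	if f.maxEntries > 0 && len(f.queue) >= f.maxEntries {
		evicted, ok = f.queue[0], true
		f.queue = f.queue[1:]
	}
	f.queue = append(f.queue, key)
	return evicted, ok
}
```

The sketch also shows the drawback raised above: entries pre-cached at startup sit at the front of the queue and are the first to go, regardless of how long they would still be valid.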

welcome · 2022-05-03T20:37:23Z

Thanks for opening this pull request! 🎉 Please check out our contributing guidelines. If you need help or want to chat with us, join us on Discord https://gofiber.io/discord

ReneWerner87 · 2022-05-04T13:28:16Z

Thanks for this feature

will review it this weekend and if everything is ok will merge it

middleware/cache/config.go

middleware/cache/heap.go

middleware/cache/cache.go

dranikpg · 2022-05-04T17:05:27Z

I've just tracked down that one allocation 🥳
It comes from promoting a heapEntry to an interface{} when pushing onto the heap.

Pushing manually (append + heap.Fix) avoids it and performance is now almost on par. Looks kind of tricky though...
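A rough sketch of the manual push, assuming a heap that stores heapEntry values in a slice (simplified here to two fields). The type entryHeap and the method pushEntry are hypothetical names; only append and heap.Fix come from the description above.

```go
package main

import "container/heap"

// heapEntry is simplified to the fields needed for ordering.
type heapEntry struct {
	key string
	exp uint64
}

// entryHeap is a min-heap of heapEntry values ordered by expiration.
type entryHeap struct {
	entries []heapEntry
}

func (h *entryHeap) Len() int           { return len(h.entries) }
func (h *entryHeap) Less(i, j int) bool { return h.entries[i].exp < h.entries[j].exp }
func (h *entryHeap) Swap(i, j int)      { h.entries[i], h.entries[j] = h.entries[j], h.entries[i] }

// Push and Pop are still needed to satisfy heap.Interface.
// Calling heap.Push would box the struct value into an interface{}.
func (h *entryHeap) Push(x interface{}) { h.entries = append(h.entries, x.(heapEntry)) }

func (h *entryHeap) Pop() interface{} {
	n := len(h.entries)
	e := h.entries[n-1]
	h.entries = h.entries[:n-1]
	return e
}

// pushEntry appends the value directly and lets heap.Fix sift it up
// from the last position, so the entry is never converted to interface{}.
func (h *entryHeap) pushEntry(e heapEntry) {
	h.entries = append(h.entries, e)
	heap.Fix(h, len(h.entries)-1)
}
```

heap.Fix re-establishes heap order for a single changed index, so calling it on the freshly appended last element has the same effect as heap.Push.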

ReneWerner87 · 2022-05-04T17:47:29Z

I just remembered that we should use mutex, sync map or atomic due to the concurrency processes to prevent race conditions in the access

ReneWerner87 · 2022-05-04T17:53:13Z

Pls update the documentation for the middleware in our docs repository
update the config documentation in our README.md for the middleware

dranikpg · 2022-05-04T17:59:04Z

> I just remembered that we should use mutex, sync map or atomic due to the concurrency processes to prevent race conditions in the access

Sync is already in place for storage writes, heap modifications happen around it. There shouldn't be any issues

dranikpg · 2022-05-04T18:01:11Z

> Pls update the documentation for the middleware in our docs repository
>
> update the config documentation in our README.md for the middleware

Before touching docs: are we sure we'll count entries and not bytes?

The problem with bytes is that this is only an estimate. We should somehow account for internal values. For example, if the user stores entries in an external storage or the values are just very small, then the internal structures will blow up in size. That value would be an "upper bound on everything" no matter how and where it is stored.

ReneWerner87 · 2022-05-04T18:59:18Z

> > I just remembered that we should use mutex, sync map or atomic due to the concurrency processes to prevent race conditions in the access
>
> Sync is already in place for storage writes, heap modifications happen around it. There shouldn't be any issues

Perfect

ReneWerner87 · 2022-05-04T19:00:48Z

> > Pls update the documentation for the middleware in our docs repository
> >
> > update the config documentation in our README.md for the middleware
>
> Before touching docs: are we sure we'll count entries and not bytes?

Will look tomorrow again more closely and answer

ReneWerner87 · 2022-05-05T11:51:47Z

@dranikpg
Ok then we should name the configuration flag differently

e.g. maxEntries

@efectn @hi019 what do you think about the count of bytes vs count of entries

efectn · 2022-05-05T11:58:09Z

> @dranikpg Ok then we should name the configuration flag differently
>
> e.g. maxEntries
>
> @efectn @hi019 what do you think about the count of bytes vs count of entries

I think counting bytes is the better option

dranikpg · 2022-05-06T09:40:12Z

middleware/cache/heap.go

+type heapEntry struct {
+	key   string
+	exp   uint64
+	bytes uint


Entry size is stored in the heap to update the total size without reading entries on delete
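The sketch below illustrates why keeping the size in the heap entry helps: eviction can adjust the running total from the popped entry alone, without reading the stored value. Only heapEntry and its three fields come from the diff above; entryHeap, sizeTracker, maxBytes, total and insert are assumed names, and the eviction policy shown is an assumption rather than the merged implementation.

```go
package main

import "container/heap"

type heapEntry struct {
	key   string
	exp   uint64
	bytes uint
}

// entryHeap is a min-heap ordered by expiration.
type entryHeap []heapEntry

func (h entryHeap) Len() int            { return len(h) }
func (h entryHeap) Less(i, j int) bool  { return h[i].exp < h[j].exp }
func (h entryHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *entryHeap) Push(x interface{}) { *h = append(*h, x.(heapEntry)) }
func (h *entryHeap) Pop() interface{} {
	old := *h
	n := len(old)
	e := old[n-1]
	*h = old[:n-1]
	return e
}

// sizeTracker keeps the total body size of all tracked entries.
type sizeTracker struct {
	maxBytes uint
	total    uint
	entries  entryHeap
}

// insert evicts the entries closest to expiration until the new entry
// fits, then records it. It returns the evicted keys so the caller can
// delete them from storage. An entry larger than maxBytes is still
// recorded once the heap is empty (an assumption of this sketch).
func (t *sizeTracker) insert(key string, exp uint64, bytes uint) []string {
	var evicted []string
	for t.total+bytes > t.maxBytes && t.entries.Len() > 0 {
		old := heap.Pop(&t.entries).(heapEntry)
		t.total -= old.bytes // size comes from the heap entry itself
		evicted = append(evicted, old.key)
	}
	heap.Push(&t.entries, heapEntry{key: key, exp: exp, bytes: bytes})
	t.total += bytes
	return evicted
}
```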

ReneWerner87 · 2022-05-10T06:49:45Z

@dranikpg thx
#1892 (comment)
don't forget the doc repository

welcome · 2022-05-10T06:50:28Z

Congrats on merging your first pull request! 🎉 We here at Fiber are proud of you! If you need help or want to chat with us, join us on Discord https://gofiber.io/discord

* Cache middleware size limit
* Replace MaxInt with MaxInt32. Add comments to benchmark
* Avoid allocation in heap push. Small fixes
* Count body sizes instead of entries
* Update cache/readme

Cache middleware size limit

c2fae9c

efectn added the ✏️ Feature label May 4, 2022

Replace MaxInt with MaxInt32. Add comments to benchmark

826f410

ReneWerner87 reviewed May 4, 2022

middleware/cache/config.go (outdated)

ReneWerner87 reviewed May 4, 2022

middleware/cache/heap.go

ReneWerner87 reviewed May 4, 2022

middleware/cache/cache.go

Avoid allocation in heap push. Small fixes

ca1e6ca

Count body sizes instead of entries

855368f

dranikpg commented May 6, 2022

Update cache/readme

8554e53

efectn approved these changes May 9, 2022

ReneWerner87 approved these changes May 10, 2022

ReneWerner87 merged commit aa22928 into gofiber:master May 10, 2022

ReneWerner87 linked an issue May 10, 2022 that may be closed by this pull request

🚀 cache middleware: set maximum size of cache #1136

Closed

dranikpg added a commit to dranikpg/docs that referenced this pull request May 10, 2022

Update cache.md

b6312e4

Add new MaxBytes param to cache docs from gofiber/fiber#1892

dranikpg mentioned this pull request May 10, 2022

Update cache.md gofiber/docs#258

Merged
