Extend the memo feature with optional parameters for expiring entries #1424
Conversation
```
@memo{
  expire(
    seconds = 10,
    minutes = 20, // only one of the time-based ones should be set
    hours   = 2,
    entries = 4000
  )
}
```
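As an illustration of how these mutually exclusive time parameters could be handled on the implementation side, here is a small Java sketch that normalizes them into a single `java.time.Duration`. This is not code from the PR; the helper name and signature are hypothetical:

```java
import java.time.Duration;
import java.util.Objects;
import java.util.stream.Stream;

// Hypothetical helper (not from the PR): normalize the @memo expire
// parameters into one Duration, enforcing that at most one of the
// time-based parameters is set, as the example's comment requires.
public final class MemoExpiry {
    static Duration expireDuration(Integer seconds, Integer minutes, Integer hours) {
        long set = Stream.of(seconds, minutes, hours).filter(Objects::nonNull).count();
        if (set > 1) {
            throw new IllegalArgumentException("only one time-based parameter may be set");
        }
        if (seconds != null) return Duration.ofSeconds(seconds);
        if (minutes != null) return Duration.ofMinutes(minutes);
        if (hours != null)   return Duration.ofHours(hours);
        return null; // no time-based expiry configured
    }
}
```

The `entries` parameter would be handled separately as a maximum-size bound rather than folded into the duration.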
Very cool and handy. It should help with memory efficiency.
Looking good to me!
A nice added bonus is that Prelude now gets a standard way to call sleep,
so I don't have to include separate code for that in my own Rascal projects anymore 😬
The reason we discussed memoization in the past was the need for an explicit way to programmatically clear the caches; I needed this to run some benchmarks of my Rascal code.
I don't see anything in this code that addresses this issue but (like discussed yesterday) I presume the previously added code to clear the caches is still there.
yeah, I'm not happy about that, but at least it's not a library function ;)
Looks good!
Thanks for the checks. I realized how we can use …
Hmmm, I would prefer less call overhead in this case, since this is code which gets hit very often. So my vote would be to leave it as is at the moment. The more runtime speed we can get, the better imho :)
I agree with Jouke; our call overhead in the interpreter is big enough 😉
--
Jurgen J. Vinju
http://jurgen.vinju.org
```java
* A monotonic ticker that is fast to frequently access and is updated roughly once every second<br/>
* <br/>
* For more reasons why we want a fast time field, see <a href="http://pzemtsov.github.io/2017/07/23/the-slow-currenttimemillis.html">this discussion</a>.
```
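The coarse ticker described in that javadoc can be sketched in plain Java as a volatile field refreshed about once a second by a daemon thread, so hot paths read a field instead of calling `System.nanoTime()`. This is an illustrative reconstruction, not the PR's actual implementation:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Sketch of a coarse monotonic ticker (illustrative, not the PR's code):
// a background daemon updates the field roughly once per second, so
// frequent readers pay only a volatile read.
public final class CoarseTicker {
    private static volatile long nowNanos = System.nanoTime();

    static {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "coarse-ticker");
            t.setDaemon(true); // never keep the JVM alive for the ticker
            return t;
        });
        ses.scheduleAtFixedRate(() -> nowNanos = System.nanoTime(), 1, 1, TimeUnit.SECONDS);
    }

    // Cheap read: at most one second stale, monotonic within that tolerance.
    public static long read() {
        return nowNanos;
    }
}
```

The trade-off is exactly the one the blog post discusses: readers accept up to a second of staleness in exchange for avoiding a comparatively expensive clock call on every cache operation.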
What an incredibly thorough and nice blog post! TIL :)
Caffeine has an optional scheduler setting that cleans up automatically based on the next expiration time. This is paced to run at most every second, to avoid excessive runs due to being off by nanos/millis. The sorted order is kept by using a timing wheel, which is an amortized O(1) priority queue.

For eviction we wrap the value with list pointers to use the O(1) LRU trick. Then max size is a poll of the list head and quick reorderings. However, making it concurrent is where all the complexity lies.

The allocation of the LookupKey per call can be removed by using a thread local. Unfortunately that doesn't play well with gc'ing ClassLoaders, so as a general library we can't make assumptions about our users' usage. I'm not sure if you can borrow much of the code, but I hope that offers optimization ideas.
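The O(1) LRU trick mentioned here (entries carry list pointers; a hit is a constant-time reorder and eviction polls the list head) can be sketched in single-threaded form using `LinkedHashMap`'s access order. This is only an illustration of the idea, not Caffeine's concurrent implementation:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Single-threaded sketch of the O(1) LRU idea: LinkedHashMap with
// accessOrder=true threads entries on a doubly linked list, moving an
// entry to the tail on each access; eviction removes the head (eldest).
final class LruSketch<K, V> extends LinkedHashMap<K, V> {
    private final int maximumSize;

    LruSketch(int maximumSize) {
        super(16, 0.75f, true); // accessOrder=true: a get() is an O(1) reorder
        this.maximumSize = maximumSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maximumSize; // evict the least recently used entry
    }
}
```

As ben-manes notes, the hard part is not this single-threaded structure but making the reorder-on-read concurrent without turning every read into a lock acquisition.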
Hi @ben-manes, always a pleasure to know you'll read along if we mention Caffeine. As you might know, we've replaced most of the caches in our code-base with Caffeine ones, even a few instances of …

**Using Caffeine**

I truly want to consider Caffeine instead of our own wrapper around …
Regarding:
I was reading the code, and it seems there is quite some bookkeeping in … I was trying to balance this.

**Optimization suggestions**
Thanks @DavyLandman, You should definitely use the right tool for the job. Sometimes that's caffeine, but certainly not always. So please only take what I say as an experience report, not an attempt to influence whether to use Caffeine or not.
Yeah, I forget the details of why in Guava we decided to only support identity equality for weak/soft references. I believe it was because MapMaker had an …

I wouldn't be against adding this (ben-manes/caffeine#344), but I am unclear on the tradeoffs, and the builder names have to be obvious & succinct (which seems harder than the code changes!).
Yes, though it's minor: calls like if-conditions that don't apply, so much of it is nanoseconds. Past benchmarks indicated that entry lock contention is the primary overhead, which leads us to have similar throughput as the underlying … Some of the code complexity is to avoid calling a …
Right, so this is why it's nice to have …
At one point I had observed escape analysis kicking in to avoid the allocation, but it is too flaky to rely on. The idea originated in the linked discussion (ben-manes/caffeine#294), where an APM agent seemed to run into allocation issues due to its high call frequency and the need to be very low overhead. They were able to use my idea of a thread local, but also pointed out why I couldn't apply that trick safely, so it has remained backlogged.

A high rate of short-lived allocations will impact the GC, so being garbage-free on read is desirable for a cache. Since they use bump-pointer allocation, it is fast until threads contend trying to get new chunks and trigger pacing to allow the GC to catch up. Allocations can be observed as a bottleneck in very hot paths where escape analysis doesn't work, e.g. if crossing a thread boundary via a concurrent queue. In this case I don't consider it too concerning, so just take note of possible optimizations if you do observe it later on.
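The thread-local lookup-key trick discussed above can be sketched as follows. All class and field names here are hypothetical (this is not Caffeine's internal code), and the sketch deliberately ignores the ClassLoader-unloading caveat that keeps a general-purpose library from using it:

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: reuse one mutable probe key per thread so a cache
// read allocates nothing. The probe's equals/hashCode delegate to the
// wrapped key, so it matches the immutable keys stored in the map.
public final class ThreadLocalLookup {
    static final class Key {
        Object inner; // mutable on the probe, fixed on stored keys
        Key(Object inner) { this.inner = inner; }
        @Override public int hashCode() { return inner.hashCode(); }
        @Override public boolean equals(Object o) {
            return (o instanceof Key) && ((Key) o).inner.equals(inner);
        }
    }

    static final ThreadLocal<Key> PROBE = ThreadLocal.withInitial(() -> new Key(null));
    static final ConcurrentHashMap<Key, String> CACHE = new ConcurrentHashMap<>();

    static String get(Object realKey) {
        Key probe = PROBE.get();
        probe.inner = realKey;   // reuse: no allocation on the read path
        try {
            return CACHE.get(probe);
        } finally {
            probe.inner = null;  // don't pin the key via the thread local
        }
    }
}
```

The unsafe part for a library is the `ThreadLocal` itself: it can keep a reference chain alive from a pooled thread to the application's classes, preventing the ClassLoader from being collected.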
You probably want to call …
Overall I am quite happy with how things turned out in Caffeine, given the constraints of a general-purpose library. A targeted design like yours can leverage more informed decisions and make nice optimizations that a general case cannot. Of course, remember to benchmark, since that's almost always forgotten and many times those nice gains unfortunately don't materialize in practice.
Based on our discussion, I've simplified the code quite a bit: I use a single class now for lookup, and either store it inside a SoftReference when putting, or use it raw for lookup.
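The "single class for lookup and storage" idea could look roughly like this: stored keys hold their referent via a `SoftReference`, while a lookup probe holds it raw, so no `SoftReference` is allocated on reads. Names and details are hypothetical, not the PR's actual code:

```java
import java.lang.ref.SoftReference;

// Hypothetical sketch: one key class that serves both roles. A stored key
// softly references its object (so the GC may reclaim memo entries under
// pressure); a probe references it directly and allocates nothing extra.
final class MemoKey {
    private final SoftReference<Object> soft; // non-null for stored keys
    private final Object raw;                 // non-null for probes
    private final int hash;                   // cached, survives GC clearing

    static MemoKey stored(Object k) { return new MemoKey(new SoftReference<>(k), null, k.hashCode()); }
    static MemoKey probe(Object k)  { return new MemoKey(null, k, k.hashCode()); }

    private MemoKey(SoftReference<Object> soft, Object raw, int hash) {
        this.soft = soft; this.raw = raw; this.hash = hash;
    }

    private Object get() { return raw != null ? raw : soft.get(); }

    @Override public int hashCode() { return hash; }
    @Override public boolean equals(Object o) {
        if (!(o instanceof MemoKey)) return false;
        Object a = get(), b = ((MemoKey) o).get();
        return a != null && a.equals(b); // a cleared reference matches nothing
    }
}
```

Because the cached hash survives even after the soft reference is cleared, cleared entries still hash to their old bucket and can be swept out during later maintenance.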
Even the postWrite task architecture stuff?
I'm putting the ThreadLocal in for now, will start some profiling & measurements in the future.
When time permits: always :)
This PR does 2 things:

The only other reason why memo caches are cleared is due to running out of memory.

Future improvements:
* `maximumSize` can also take into account usage of entries

Since I know at least @joukestoel and @rodinaarssen have talked about this at some point, will you take a look before I merge into master?