Benchmarks against moize #29

Open
benderTheCrime opened this issue Feb 1, 2017 · 18 comments

@benderTheCrime

It looks like moize claims to be faster than fast-memoize in terms of ops/s, based on whatever tests @planttheidea is running. Can you refute or substantiate this?

for reference: https://github.com/planttheidea/moize#benchmarks

@planttheidea
Collaborator

creator of moize here, figured I'd chime in.

I crafted my words carefully when writing my README. In benchmarks on my various machines (an old laptop, my work desktop, and my primary dev machine at home), yes ... moize was faster than fast-memoize in both single- and especially multi-parameter memoization. That shouldn't be dismissed, but it shouldn't be taken as gospel either, because benchmarks inherently vary from machine to machine. moize does not try to be the fastest, period ... it tries to be very fast while maintaining ease of use for a majority of common use cases. There can be no "fastest", because some benchmark on some machine will inevitably prove you wrong, and that kind of quibbling makes the results suffer: winning benchmarks becomes more important than actual applications, which defeats the purpose.

Just figured I would try to set a baseline to prevent any chest-thumping, because I don't care about necessarily being the fastest ... it just kind of happened that way (for now, someone will beat it later).

@benderTheCrime
Author

benderTheCrime commented Feb 1, 2017 via email

@planttheidea
Collaborator

planttheidea commented Feb 1, 2017

I'm not saying those invalidate my benchmarks, far from it ... but your question and follow-up statement appear to assume there is a definitive answer, and I was just pointing out that any such guarantee is unlikely to hold much weight.

I recommend forking the repos and running the benchmarks yourself. It is a two-minute process that will yield tangible results, and then you can judge for yourself.

@caiogondim
Owner

@planttheidea Thanks for jumping in.

About being the fastest
There is a lot of misinterpretation when people read that fast-memoize.js is the fastest memoization library in JavaScript. Most of them interpret it in an arrogant way. Maybe it's my fault as a non-native English speaker. What I mean when I say that is that I will try to keep the library up to date with the fastest algorithm, not that I'm the best programmer in the world.

About benchmarks
You have some interesting benchmarks with more cases; I will incorporate them.

About open source
I will run the benchmarks, and if your algorithm is faster, I will incorporate it into fast-memoize.js, learn more about JavaScript, give you the credit, and release it.
I hope that's OK with you, since you're using the MIT license.

If you want to help on the project, I have given you commit access.
It would be cool to work with you.

@caiogondim
Owner

Will keep this issue open until I incorporate moize into the benchmark.

@planttheidea
Collaborator

planttheidea commented Feb 2, 2017

@caiogondim thanks, always nice to find someone who likes to work together instead of compete. 😸

I can say that the biggest optimization is in the caching mechanism used. I created a custom implementation of a barebones Map class, where the order of retrieval is updated automatically on an LRU basis on both get and set, and the most recent entry is also kept in a separate lastItem instance value. I then also place a check on all get actions to see if the key requested is the last item. This places a bias on the most recently used argument, which greatly improves benchmarks but, more importantly, optimizes for the real-life scenario of rarely-changing values. The auto-LRU maintenance also has the benefit of facilitating the maxSize option, which drops the oldest item in cache once the cache size limit is reached.

There are other optimization points (leveraging closures via partial application functions, for example), but that custom Map class is really where I saw a massive boost. Hope that helps.
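For illustration, a minimal sketch of that kind of recency-biased cache (hypothetical names and structure, not moize's actual implementation) might look like this:

```js
// Hypothetical sketch of a recency-biased cache along the lines described above.
class RecencyCache {
  constructor(maxSize = Infinity) {
    this.keys = [];
    this.values = [];
    this.lastKey = undefined;
    this.lastValue = undefined;
    this.maxSize = maxSize;
  }

  get(key) {
    // Fast path: the function was just called with the same key.
    if (key === this.lastKey) return this.lastValue;
    const index = this.keys.indexOf(key);
    if (index === -1) return undefined;
    // Move the hit to the front so retrieval order stays LRU-sorted on get.
    this.keys.unshift(this.keys.splice(index, 1)[0]);
    this.values.unshift(this.values.splice(index, 1)[0]);
    this.lastKey = this.keys[0];
    this.lastValue = this.values[0];
    return this.lastValue;
  }

  set(key, value) {
    this.keys.unshift(key);
    this.values.unshift(value);
    if (this.keys.length > this.maxSize) {
      // Drop the oldest entry once the size limit is reached (maxSize behaviour).
      this.keys.pop();
      this.values.pop();
    }
    this.lastKey = key;
    this.lastValue = value;
  }
}
```

The `key === this.lastKey` check is what makes repeated calls with the same arguments nearly free.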

@caiogondim
Owner

Thanks for the lesson =)
I will read your code, run the benchmark, and try to incorporate some of your ideas, with full credit to you.

Will keep you posted.

Thanks a lot @planttheidea 😄

@anywhichway
Contributor

One of the things you will find is that performance can vary dramatically across web browsers. Oddly, even on the same machine, we frequently find browsers run faster than Node.js. See http://www.jsbenchmarks.com/?anywhichway/memoize/master/benchmark.js for browser comparisons.

@qm3ster

qm3ster commented Feb 25, 2019

@anywhichway any suspicion how browsers end up being faster?
I've had that observation many times, and it worries me.

@dantman

dantman commented Feb 26, 2019

From what I can see, the performance difference between fast-memoize and moize exists because they are based on fundamentally different memoization strategies, which have different pros, cons, and behaviours.

fast-memoize uses the classic string-cache-key memoization strategy, where the args are serialized to a string that is used as a cache key.

moize by default uses a variant of the map/equality-based memoization strategy: instead of serializing the args to a string cache key, the args are compared for equality against the argument values of previous invocations.

There are of course other notable memoization strategies, like the "one-value" style memoize-one uses, and the WeakMap-based style of memoize-weak and one of the memoize-immutable options.
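As a rough illustration of the two main strategies being contrasted here (simplified sketches, not the actual implementations of either library):

```js
// 1. Serialized cache key: arguments become a string that indexes a plain object.
function memoizeSerialized(fn) {
  const cache = {};
  return function (...args) {
    const key = JSON.stringify(args);
    if (!(key in cache)) cache[key] = fn(...args);
    return cache[key];
  };
}

// 2. Equality-based: argument lists are compared (here by strict equality)
//    against those of previous calls, with no serialization step.
function memoizeByEquality(fn) {
  const calls = []; // [{ args, result }]
  return function (...args) {
    const hit = calls.find(
      (call) =>
        call.args.length === args.length &&
        call.args.every((arg, i) => arg === args[i])
    );
    if (hit) return hit.result;
    const result = fn(...args);
    calls.push({ args, result });
    return result;
  };
}
```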


Lately I've been lamenting the fact that we have several different memoization libraries using several different strategies. Each strategy has pros and cons that make it optimal for different scenarios. I actually had a project where I was making use of 2-3 different memoization libraries because different parts of the code needed different types of memoization. And I've been wishing the authors of most of the different memoization libraries could agree to move everything into a single memoization library that incorporates all the different memoization strategies and lets you choose between them. e.g. memoize(byCacheKey()), memoize(byEquality()), memoize(byWeak()), etc I suppose.
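A purely hypothetical shape for that kind of unified API (these names are invented here for illustration and do not belong to any existing library) could be:

```js
// memoize takes a strategy factory; the strategy supplies the cache behaviour.
function memoize(createCache) {
  return (fn) => {
    const cache = createCache();
    return (...args) => {
      const key = cache.key(args);
      if (cache.has(key)) return cache.get(key);
      const result = fn(...args);
      cache.set(key, result);
      return result;
    };
  };
}

// One strategy: serialized cache key backed by a Map.
const byCacheKey = () => () => {
  const store = new Map();
  return {
    key: (args) => JSON.stringify(args),
    has: (key) => store.has(key),
    get: (key) => store.get(key),
    set: (key, value) => store.set(key, value),
  };
};

// Usage: const memoized = memoize(byCacheKey())(expensiveFn);
```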

Sorry for the minor hijack. But I'm wondering what @caiogondim and @planttheidea think about this.

@planttheidea
Collaborator

@dantman no disagreement, it's about the best tool for the job.

The serialization technique fast-memoize uses does have use cases, which is why I kept a serialize option for moize (moize.serialize), but its specific use case is less frequent than strict or even deep equality, which is why it's not the default on moize.

A WeakMap based approach sounds good (no need to manually remove from cache to clean up references), but LRU handling becomes more costly, and therefore so do explicitly-limited cache sizes. In reality the same memory/cleanup savings can be achieved partly by having cache entries that expire.
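A minimal sketch of WeakMap-based memoization for a function taking a single object argument (a hypothetical helper, not memoize-weak's or moize's actual API) shows both sides of that trade-off:

```js
// Entries disappear automatically once the key object is garbage-collected,
// which is the upside; the downside is that WeakMap entries cannot be
// enumerated or re-ordered, so recency bias and explicit size limits are
// harder to add on top.
function memoizeWeak(fn) {
  const cache = new WeakMap();
  return function (obj) {
    if (cache.has(obj)) return cache.get(obj);
    const result = fn(obj);
    cache.set(obj, result);
    return result;
  };
}
```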

Really, what you should be asking is why you are bringing in separate libraries for the same goal. Can I accomplish my goals with fast-memoize and a custom cache? Can I do it with moize and, in certain cases, apply its serialize option? Once you understand why you think you need different libraries, you can try to whittle it down to one by leveraging the full capability of the one that works best for your use cases.

@anywhichway
Contributor

@dantman I like your thoughts here. If I were going to use one as a foundation for the type of thing you describe, it would be fast-memoize. Of all the memoizers I have used, fast-memoize seems to have the cleanest architecture and code, although there are a number of open issues.

My memoizer, https://github.com/anywhichway/nano-memoize, tried to address your concerns a little by choosing the strategy that seemed best for the situation. On the other hand, it was only written in response to the initial claim of fast-memoize to be the "fastest possible memoizer", which it only seems to be in one case. And the final optimization that made it that way (avoiding a pre-lookup) was arguably counter to what some of its users expect. Read about half-way down here: #65. Since launch, the author has acknowledged it was probably an error to lead off the article that way.

Note, nano-memoize has some pretty gnarly code and would not be appropriate for generalization.

@dantman

dantman commented Feb 26, 2019

> A WeakMap based approach sounds good (no need to manually remove from cache to clean up references), but LRU handling becomes more costly, and therefore so do explicitly-limited cache sizes.

Why do you think you need LRU handling for WeakMap-based memoization? The weak-reference traits get rid of most of the reasons you might want an LRU.

@planttheidea
Collaborator

@dantman for bias toward recency. Memoizers that have a cache of caches, like those being discussed, bias towards most recent calls because the expectation of a memoized function is that it changes parameters infrequently. By biasing towards recency, you make the likely use case of "called with the same parameters as last time" faster. I know I used the term LRU which is often about clearing out the oldest items in cache, but the same ordering mechanism LRU uses is used for recency bias as well. Apologies for the confusion.

@dantman

dantman commented Feb 27, 2019

@planttheidea OK, interesting use case. I have mixed feelings about the side effects of a recency cache. But in what way is this more costly to do with WeakMaps / how does it negate the significant advantages of a WeakMap cache?

@anywhichway
Contributor

anywhichway commented Feb 27, 2019 via email

@planttheidea
Collaborator

@dantman I don't want to derail this thread. Originally you talked about needing to use multiple libraries because they need different kinds of memoization, and the point I was getting across was you probably could just use a single library more effectively.

There are several out there with a wide variety of options ... I tried to make moize usable for a myriad of use cases, but there are several others out there. You may like memoizee, for example ... It's not as performant in the common use cases, but it's very robust and even has a WeakMap implementation.

@dantman

dantman commented Feb 27, 2019

> Originally you talked about needing to use multiple libraries because they need different kinds of memoization, and the point I was getting across was you probably could just use a single library more effectively.

OK, I didn't know memoizee did a weak strategy in addition to the cache-key strategy. However, there isn't really any single memoization library that is clean to use and covers all the memoization strategies I might need. On my last project I used memoize-one when I needed the "only-one" style of memoization, memoize-immutable+Map for fixed-size single-argument stuff, and memoize-immutable+WeakMap/MixedTupleMap/WeakTupleMap when I needed weak-reference memoization.

Like I mentioned, I really wish we could have one library with all the strategies (not just 1-2 like most) and a good API where it's easy to use all the strategies.
