
improve performance #214

Merged: 14 commits merged into fastify:master on Aug 3, 2022

Conversation

@Uzlopak (Contributor) commented Jul 13, 2022

Will provide benchmarks later if requested, but it is about 10% faster (before: 2,600k requests/s; after: 2,800k requests/s).
It would be nice if you could give me feedback on the code.

[Resolved review comments on test/vary.test.js and vary.js]
Uzlopak and others added 2 commits July 14, 2022 12:03
Co-authored-by: Matteo Collina <matteo.collina@gmail.com>
@Uzlopak (Contributor, Author) commented Jul 18, 2022

@mcollina
Please take a look

@Uzlopak (Contributor, Author) commented Jul 18, 2022

I focused on what happens when we parse the vary header again and again. I assume that the vary header is generated deterministically, so I use caching; even with more than one value for vary it stays fast. On my machine, hitting it with autocannon used to drop it from 2.6M requests to 0.17M requests. Now it only falls from 2.6M requests to about 2M requests.
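The idea, as a minimal sketch (a plain Map stands in for the cache here; the merged code differs):

```js
// Cache the parsed field list keyed on the raw header string: the server
// emits a small, deterministic set of Vary values, so cache hits dominate.
const seen = new Map()

function parseVary (header) {
  let fields = seen.get(header)
  if (fields === undefined) {
    fields = header.split(',').map(f => f.trim().toLowerCase())
    seen.set(header, fields) // unbounded here; the PR bounds it with an LRU
  }
  return fields
}
```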

[Resolved review comment on vary.js]
@Uzlopak (Contributor, Author) commented Jul 18, 2022

@mcollina

I am very proud of this now. It's simple, it's fast, it's sexy. :D

My autocannon benchmarks show about 2.8M requests with no vary header manipulation, 2.6M requests with one vary header manipulation, 2.4M with two, 2.2M with three, and so on.
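For reference, throughput numbers like these come from pointing autocannon at a running server; a sketch of such a run (the URL and flags are my assumptions, not the exact command used here):

```sh
# 100 concurrent connections for 10 seconds against a local Fastify server
npx autocannon -c 100 -d 10 http://localhost:3000
```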

@Uzlopak (Contributor, Author) commented Jul 19, 2022

@mcollina
@jsumners
@Eomm

Does it make sense to create a new package @fastify/vary so this code can also be used in @fastify/sensible?

@mcollina (Member) left a comment

lgtm

@mcollina (Member)

why do we need vary in sensible?

@Uzlopak (Contributor, Author) commented Jul 20, 2022

@mcollina

Sensible also uses vary.

https://github.com/fastify/fastify-sensible/blob/92139a8433b376294f6a3eee51ab38cdb3b3c872/package.json#L39

@mcollina (Member)

It's just exposed to the end users. The API of what was added here isn't the same as the vary module's, so it's not a 100% swap.

[Resolved review comment on vary.js]
add benchmarks
@Uzlopak (Contributor, Author) commented Jul 20, 2022

I added benchmarks:

vary - field to * x 11,460,799 ops/sec ±0.65% (190 runs sampled)
vary - * to field x 8,176,435 ops/sec ±0.92% (187 runs sampled)
vary - field to empty x 6,908,442 ops/sec ±0.77% (188 runs sampled)
vary - fields string to empty x 8,653,009 ops/sec ±0.60% (189 runs sampled)
vary - field to fields x 3,660,382 ops/sec ±0.69% (189 runs sampled)

cors - field to * x 76,832,035 ops/sec ±1.12% (187 runs sampled)
cors - * to field x 34,669,727 ops/sec ±0.95% (186 runs sampled)
cors - field to empty x 27,483,166 ops/sec ±0.51% (191 runs sampled)
cors - fields string to empty x 33,894,136 ops/sec ±1.82% (179 runs sampled)
cors - field to fields x 334,247 ops/sec ±0.19% (194 runs sampled)
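The output format above matches the benchmark package; a minimal sketch of a suite that prints lines in this shape (the case body is a placeholder, not the PR's actual benchmark file):

```js
const Benchmark = require('benchmark')

new Benchmark.Suite()
  .add('cors - field to *', function () {
    // placeholder: invoke the vary-header manipulation under test here
  })
  .on('cycle', function (event) {
    // prints e.g. "cors - field to * x 76,832,035 ops/sec ±1.12% (187 runs sampled)"
    console.log(String(event.target))
  })
  .run()
```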

@mcollina (Member) left a comment

lgtm

@mcollina mcollina requested a review from Eomm July 20, 2022 10:15
@Uzlopak (Contributor, Author) commented Jul 20, 2022

While running the benchmarks, I realized that the case cors - field to fields is 10 times slower. I will investigate this further.

Sorry :(

@climba03003 (Member) left a comment

> While running the benchmarks, I realized that the case cors - field to fields is 10 times slower. I will investigate this further.

The Request Changes here is blocking this from landing, based on the message above.
Let me know when the slow case is fixed and I will dismiss the Request Changes.

@Uzlopak (Contributor, Author) commented Jul 20, 2022

It kind of feels strange to use caching that much. I also use mnemonist because it is obviously faster. I would have preferred the regex, but well... anyway.

vary - field to undefined x 7,106,800 ops/sec ±1.04% (188 runs sampled)
vary - field to * x 11,269,425 ops/sec ±1.12% (192 runs sampled)
vary - * to field x 8,844,989 ops/sec ±0.79% (191 runs sampled)
vary - field to empty x 7,321,003 ops/sec ±1.03% (188 runs sampled)
vary - fields string to empty x 8,739,100 ops/sec ±1.11% (191 runs sampled)
vary - field to fields x 4,053,665 ops/sec ±0.46% (192 runs sampled)
cors - field to undefined x 79,317,181 ops/sec ±1.11% (189 runs sampled)
cors - field to * x 65,132,203 ops/sec ±1.17% (187 runs sampled)
cors - * to field x 40,600,000 ops/sec ±1.87% (175 runs sampled)
cors - field to empty x 37,364,825 ops/sec ±0.92% (188 runs sampled)
cors - fields string to empty x 40,388,831 ops/sec ±1.45% (176 runs sampled)
cors - field to fields x 18,774,807 ops/sec ±0.70% (184 runs sampled)
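For context, mnemonist's LRU is a fixed-capacity cache, which also bounds how much memory the caching can ever use; a small usage sketch (the capacity of 1000 is an assumption, see the cacheSize discussion below):

```js
const LRUCache = require('mnemonist/lru-cache')

// At most 1000 distinct Vary strings are kept; once the cache is full,
// the least recently used entry is evicted, so memory stays flat.
const cache = new LRUCache(1000)

cache.set('accept-encoding', ['accept-encoding'])
cache.get('accept-encoding') // -> ['accept-encoding']
cache.get('origin')          // -> undefined on a miss
```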

package.json Outdated
@@ -45,7 +45,7 @@
   },
   "dependencies": {
     "fastify-plugin": "^4.0.0",
-    "vary": "^1.1.2"
+    "mnemonist": "^0.39.2"
A Member left a comment

This seems like a really heavy dependency. It also isn't at a stable release number.

@Uzlopak (Contributor, Author) commented Jul 21, 2022

@jsumners

tiny-lru is slower than mnemonist

vary:

vary - field to undefined x 7,098,732 ops/sec ±0.75% (190 runs sampled)
vary - field to * x 11,084,521 ops/sec ±0.93% (191 runs sampled)
vary - * to field x 7,882,434 ops/sec ±0.69% (188 runs sampled)
vary - field to empty x 6,949,372 ops/sec ±0.77% (189 runs sampled)
vary - fields string to empty x 8,385,889 ops/sec ±0.70% (189 runs sampled)
vary - field to fields x 3,548,317 ops/sec ±0.71% (191 runs sampled)

mnemonist:

cors - field to undefined x 108,076,176 ops/sec ±2.34% (167 runs sampled)
cors - field to * x 87,009,887 ops/sec ±1.46% (175 runs sampled)
cors - * to field x 79,040,526 ops/sec ±0.40% (193 runs sampled)
cors - field to empty x 76,961,478 ops/sec ±0.53% (191 runs sampled)
cors - fields string to empty x 77,507,047 ops/sec ±0.46% (191 runs sampled)
cors - field to fields x 62,389,119 ops/sec ±1.21% (176 runs sampled)

tiny-lru:

cors - field to undefined x 106,751,509 ops/sec ±1.82% (171 runs sampled)
cors - field to * x 88,892,481 ops/sec ±1.48% (178 runs sampled)
cors - * to field x 78,700,246 ops/sec ±0.46% (191 runs sampled)
cors - field to empty x 77,239,556 ops/sec ±0.52% (189 runs sampled)
cors - fields string to empty x 76,970,667 ops/sec ±0.61% (191 runs sampled)
cors - field to fields x 27,534,170 ops/sec ±0.55% (191 runs sampled)

It is still about 10 times faster than the original vary. So if you want, I can use tiny-lru.

@mcollina (Member)

mnemonist is faster than tiny-lru in all possible cases: https://github.com/dominictarr/bench-lru.

@kibertoad (Member)

I suggest pinning mnemonist at a patch version, since it's not at 1.0.0 yet and I expect breaking changes in minor versions.
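Pinning means dropping the range operator so npm installs exactly that version; a sketch of the resulting dependencies block (versions taken from the diff above):

```json
"dependencies": {
  "fastify-plugin": "^4.0.0",
  "mnemonist": "0.39.2"
}
```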

@Uzlopak (Contributor, Author) commented Aug 3, 2022

@kibertoad
pinned

@Uzlopak (Contributor, Author) commented Aug 3, 2022

@climba03003

Would you like to review again, please?

@mcollina mcollina merged commit f523476 into fastify:master Aug 3, 2022
@kurtextrem commented Aug 9, 2022

  1. Out of curiosity, did anyone benchmark other LRU implementations? (If so, I don't want to repeat the work.) I'm mainly asking because e.g. https://github.com/lukeed/flru is a much smaller dep (it has a benchmark at the bottom, so maybe those are also worth benchmarking). There is even a native addon, which might also improve perf of the LRU: https://github.com/adzerk/node-lru-native

  2. I'm curious whether the new parsing implementation is faster than https://github.com/jshttp/vary, and whether it then makes sense to open a PR there as well so everyone benefits from this speedup.

  3. Apart from this, what is the expected memory increase of using the LRU? This could very well be important for some users.

@Uzlopak Uzlopak deleted the perf branch August 9, 2022 15:25
@Uzlopak (Contributor, Author) commented Aug 9, 2022

@kurtextrem

  1. I am actually thinking a lot about making a general decision regarding an LRU for fastify, so that we use the same LRU everywhere.

  2. I don't think we should upstream anything to vary. The caching is something I think would not get past their maintainers. And tbh I don't expect caching in modules from the jshttp org.

  3. I don't think it has any significant memory increase. The Vary header is something very deterministic set by the server, so I strongly assume that in production we actually cache about 20 entries in the LRU cache. It's not like we produce masses of different vary headers on the fly.

@kurtextrem commented Aug 13, 2022

@Uzlopak
Regarding point 1: Do you have any discussion to point me to? In any case, maybe this section is helpful for making decisions for fastify: https://github.com/isaacs/node-lru-cache#performance

Regarding point 2: What I meant is that this PR changes the algorithm for parsing the Vary header in general (excluding the LRU cache here). The two implementations differ completely now, and I wonder which one is actually faster, since the new benchmarks measure "new algorithm + LRU" rather than "only new algorithm". So what if the old algorithm (from the vary npm module) is faster when combined with the LRU? And if it isn't, then pushing the new algorithm to that package might make sense.

Regarding point 3: In that case, I think we should reduce the 1000 reserved slots by default (https://github.com/fastify/fastify-cors/pull/214/files#diff-6e4256d6cce0fe10bc30208d12a22415f2db8cefe0aeebf6fa2830e590539894R70)

@Uzlopak (Contributor, Author) commented Aug 13, 2022

The only thing we used vary for was to add the necessary key to the Vary field if it was not already set, plus of course the special case of *.

I don't even think it would actually hurt to have duplicate keys in Vary.

So the only chokepoint is determining whether the key is already in the provided string. I think even without caching we already have some noticeable gains. I think maybe I can squeeze out more perf...
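For illustration, a minimal sketch of that chokepoint without any caching (the helper name and details are mine, not the PR's):

```js
// Hypothetical helper: add a field to an existing Vary header value only
// if it is not already present; '*' short-circuits everything else.
function appendVary (current, field) {
  if (current === '*' || field === '*') return '*'
  if (current === undefined || current === '') return field

  // The chokepoint: is the key already in the provided string?
  const fields = current.toLowerCase().split(/\s*,\s*/)
  if (fields.includes(field.toLowerCase())) return current

  return current + ', ' + field
}

// appendVary('Accept', 'Origin') -> 'Accept, Origin'
// appendVary('Origin', 'origin') -> 'Origin' (already present, case-insensitive)
```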

We could reduce the cacheSize to 100 if you like, like the 100 used in
https://github.com/fastify/fastify/blob/37ae4ea1f7490e8194e42b1bf85a9db60f048c57/lib/contentTypeParser.js#L40

I am actually kind of unopinionated regarding LRU caching.
