improve performance #214
Conversation
Co-authored-by: Matteo Collina <matteo.collina@gmail.com>
@mcollina
I focused on what happens when we parse the vary header again and again. I assume that the vary header is generated deterministically, so I use caching; even with more than one value for vary, it stays fast. On my machine, hitting it with autocannon used to drop throughput from 2.6 million requests to 0.17 million requests. Now it only drops from 2.6 million requests to about 2 million requests.
I am very proud of this now. It's simple, it's fast, it's sexy. :D My autocannon benchmarks show about 2.8 million requests with no vary header manipulation, 2.6 million with one vary header manipulation, 2.4 million with two, 2.2 million with three, and so on.
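The caching idea described above can be sketched roughly like this: memoize the result of appending a field to a given Vary header value, so repeated requests carrying the same header skip the parsing work entirely. This is an illustrative sketch, not the actual fastify-cors implementation; the function and cache names are hypothetical.

```javascript
// Hypothetical sketch: cache the (currentVary, field) -> newVary mapping
// so the slow parse path runs only once per distinct input.
const cache = new Map()

function appendVary (current, field) {
  const key = current + '\u0000' + field
  let result = cache.get(key)
  if (result === undefined) {
    // Slow path: parse the existing header once, then cache the answer.
    const fields = current === ''
      ? []
      : current.split(',').map(f => f.trim().toLowerCase())
    if (fields.includes(field.toLowerCase())) {
      result = current
    } else {
      result = current === '' ? field : current + ', ' + field
    }
    cache.set(key, result)
  }
  return result
}

console.log(appendVary('Accept-Encoding', 'Origin')) // 'Accept-Encoding, Origin'
console.log(appendVary('Origin', 'origin'))          // 'Origin' (already present)
```

A plain unbounded `Map` is used here only for brevity; in a long-running server the cache needs a size bound (hence the LRU discussion below in the thread).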
lgtm
why do we need vary in sensible?
It's just exposed to the end users. The API of what was added in here is not the same as
add benchmarks
I added benchmarks:
lgtm
While running the benchmarks, I realized that the case cors - field to fields is 10 times slower. Sorry :(
While running the benchmarks, I realized that the case cors - field to fields is 10 times slower. I will investigate this further.
The Request Changes review here is blocking this from landing, based on the message above.
Let me know when the slow case is fixed and I will dismiss the Request Changes.
It kind of feels strange to rely on caching that much. I also use mnemonist because it is obviously faster. I would have preferred the regex, but well... Anyway.
package.json
Outdated
@@ -45,7 +45,7 @@
   },
   "dependencies": {
     "fastify-plugin": "^4.0.0",
-    "vary": "^1.1.2"
+    "mnemonist": "^0.39.2"
This seems like a really heavy dependency. It also isn't at a stable release number.
tiny-lru is slower than mnemonist.
vary:
mnemonist:
tiny-lru:
It is still about 10 times faster than the original vary. So if you want, I can use tiny-lru.
mnemonist is faster than tiny-lru in all possible cases: https://github.com/dominictarr/bench-lru.
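For readers unfamiliar with what the LRU cache is doing in this PR, here is a minimal, dependency-free sketch of the data structure using a plain `Map` (JavaScript `Map`s iterate in insertion order, which makes eviction of the oldest entry cheap). This is not mnemonist's implementation, which preallocates fixed-size slots and is considerably faster, as the bench-lru results show.

```javascript
// Naive LRU cache sketch: least-recently-used entry is evicted when
// capacity is exceeded. Illustrative only; the PR uses mnemonist's LRUCache.
class SimpleLRU {
  constructor (capacity) {
    this.capacity = capacity
    this.map = new Map()
  }

  get (key) {
    if (!this.map.has(key)) return undefined
    const value = this.map.get(key)
    // Refresh recency: re-insert so this key moves to the "newest" end.
    this.map.delete(key)
    this.map.set(key, value)
    return value
  }

  set (key, value) {
    if (this.map.has(key)) {
      this.map.delete(key)
    } else if (this.map.size >= this.capacity) {
      // Evict the least recently used key (first in insertion order).
      this.map.delete(this.map.keys().next().value)
    }
    this.map.set(key, value)
  }
}

const lru = new SimpleLRU(2)
lru.set('a', 1)
lru.set('b', 2)
lru.get('a')              // touch 'a', so 'b' becomes least recently used
lru.set('c', 3)           // evicts 'b'
console.log(lru.get('b')) // undefined
console.log(lru.get('a')) // 1
```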
I suggest pinning mnemonist at the patch version, since it's not at 1.0.0 yet, so I expect breaking changes in minor versions.
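In package.json terms, that suggestion could look like the fragment below. The version numbers come from this PR's diff; the tilde range (which allows patch updates but not minor bumps) is my reading of "pinning at the patch version", not something spelled out in the thread.

```json
{
  "dependencies": {
    "fastify-plugin": "^4.0.0",
    "mnemonist": "~0.39.2"
  }
}
```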
@kibertoad
Do you want to review again, please?
@Uzlopak Regarding point 2: what I meant is, this PR changes the algorithm for parsing the vary header.

Regarding point 3: in this case, I think we should reduce the 1000 reserved slots by default (https://github.com/fastify/fastify-cors/pull/214/files#diff-6e4256d6cce0fe10bc30208d12a22415f2db8cefe0aeebf6fa2830e590539894R70).
The only thing we used vary for was to add the necessary key to the vary field when it was not already set, and of course the special case of *. I don't even think that it would actually hurt to have duplicate keys in vary. So the only chokepoint is determining whether the key is already in the provided string. I think without caching we already have some noticeable gains. I think maybe I can squeeze out more perf...

We could reduce cacheSize to 100 if you like. I am actually kind of unopinionated regarding LRU caching.
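The uncached core of what is described above (append a key to the Vary value only if it is not already present, with `*` short-circuiting everything) can be sketched like this. The function name and exact formatting are hypothetical, not the fastify-cors source; the string scan in the middle is the chokepoint the caching is meant to avoid.

```javascript
// Hypothetical uncached sketch: '*' covers every field, otherwise scan
// the existing comma-separated value before appending.
function addToVary (current, field) {
  if (current === '*' || field === '*') return '*'
  // Chokepoint: scan the existing value for the field name.
  const present = current
    .split(',')
    .some(f => f.trim().toLowerCase() === field.toLowerCase())
  return present
    ? current
    : (current === '' ? field : current + ', ' + field)
}

console.log(addToVary('Accept', 'Origin'))  // 'Accept, Origin'
console.log(addToVary('origin', 'Origin'))  // 'origin' (case-insensitive match)
console.log(addToVary('Accept', '*'))       // '*'
```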
I will provide benchmarks later if requested, but it is about 10% faster (before: 2,600k requests/s; after: 2,800k requests/s).
It would be nice if you could give me feedback regarding the code.