Batch applying events to kqueue #449
base: unstable
Conversation
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files

@@ Coverage Diff @@
## unstable #449 +/- ##
============================================
+ Coverage 70.17% 70.21% +0.03%
============================================
Files 109 109
Lines 59904 59904
============================================
+ Hits 42039 42059 +20
+ Misses 17865 17845 -20
Since we don't automatically test on FreeBSD: https://github.com/valkey-io/valkey/actions/runs/8994018097
Thanks, I see it succeeded. Just out of curiosity, why didn't we set up a macOS runner that runs by default?
Runner execution time was the historical reason: we didn't want to burn resources on running all tests, since we have a daily set of tests that is slightly more comprehensive.
kqueue has the capability of batch applying events. This PR implements that capability for kqueue, which lets us eliminate a large number of `kevent(2)` system calls.

Signed-off-by: Andy Pan <i@andypan.me>
Hi @madolson, any new thoughts on this?
@panjf2000 It makes sense to me, I just haven't had a chance to walk through it and convince myself the code is correct. Hopefully tomorrow?
Sure, take your time. I was asking only to ensure that you'd like to continue reviewing this.
Do we expect this to make the code more performant?
This PR is expected to save a significant number of `kevent(2)` system calls.
Would you mind using …
Environment
Benchmark command
# With RDB and AOF disabled on the server.
valkey-benchmark -r 10000000 -d 100 -n 10000000 -q

Benchmark results

valkey-io:unstable
PING_INLINE: 141031.80 requests per second, p50=0.175 msec
PING_MBULK: 164306.14 requests per second, p50=0.159 msec
SET: 160627.09 requests per second, p50=0.191 msec
GET: 156184.11 requests per second, p50=0.175 msec
INCR: 161022.81 requests per second, p50=0.167 msec
LPUSH: 164573.83 requests per second, p50=0.167 msec
RPUSH: 164098.53 requests per second, p50=0.167 msec
LPOP: 163655.41 requests per second, p50=0.167 msec
RPOP: 162211.27 requests per second, p50=0.167 msec
SADD: 160130.66 requests per second, p50=0.167 msec
HSET: 151906.42 requests per second, p50=0.191 msec
SPOP: 158737.72 requests per second, p50=0.175 msec
ZADD: 127665.01 requests per second, p50=0.319 msec
ZPOPMIN: 158866.33 requests per second, p50=0.175 msec
LPUSH (needed to benchmark LRANGE): 164546.77 requests per second, p50=0.167 msec
LRANGE_100 (first 100 elements): 86594.33 requests per second, p50=0.311 msec
LRANGE_300 (first 300 elements): 36351.34 requests per second, p50=0.663 msec
LRANGE_500 (first 500 elements): 22212.60 requests per second, p50=0.751 msec
LRANGE_600 (first 600 elements): 19722.66 requests per second, p50=0.839 msec
MSET (10 keys): 54174.70 requests per second, p50=0.839 msec
XADD: 157696.38 requests per second, p50=0.175 msec

panjf2000:kqueue-batch
PING_INLINE: 159212.86 requests per second, p50=0.159 msec
PING_MBULK: 184396.38 requests per second, p50=0.143 msec
SET: 173722.70 requests per second, p50=0.207 msec
GET: 179742.97 requests per second, p50=0.159 msec
INCR: 182832.06 requests per second, p50=0.159 msec
LPUSH: 174146.25 requests per second, p50=0.151 msec
RPUSH: 174794.62 requests per second, p50=0.151 msec
LPOP: 173250.17 requests per second, p50=0.151 msec
RPOP: 175260.27 requests per second, p50=0.151 msec
SADD: 181957.12 requests per second, p50=0.159 msec
HSET: 174380.08 requests per second, p50=0.175 msec
SPOP: 186985.80 requests per second, p50=0.159 msec
ZADD: 139279.66 requests per second, p50=0.303 msec
ZPOPMIN: 182648.41 requests per second, p50=0.151 msec
LPUSH (needed to benchmark LRANGE): 180554.31 requests per second, p50=0.151 msec
LRANGE_100 (first 100 elements): 93919.64 requests per second, p50=0.287 msec
LRANGE_300 (first 300 elements): 37977.62 requests per second, p50=0.639 msec
LRANGE_500 (first 500 elements): 23096.98 requests per second, p50=0.719 msec
LRANGE_600 (first 600 elements): 20357.48 requests per second, p50=0.815 msec
MSET (10 keys): 54659.74 requests per second, p50=0.823 msec
XADD: 170558.23 requests per second, p50=0.159 msec

Ok, those performance numbers are compelling, although I'm a little surprised at the result. During the typical command flow, we shouldn't call any of these functions, so I'm surprised to see so much improvement.
The benchmark for data-retrieval commands, with:
./valkey-benchmark -r 10000000 -d 1024 -n 10000000 -P 100 -t get,lpop,rpop,spop,zpopmin -q

valkey-io:unstable:
GET: 3898635.50 requests per second, p50=1.151 msec
LPOP: 3837298.50 requests per second, p50=1.183 msec
RPOP: 3752345.25 requests per second, p50=1.215 msec
SPOP: 3979307.50 requests per second, p50=1.135 msec
ZPOPMIN: 3790750.50 requests per second, p50=1.199 msec

panjf2000:kqueue-batch:
GET: 4093327.75 requests per second, p50=1.095 msec
LPOP: 3971406.00 requests per second, p50=1.143 msec
RPOP: 4127115.00 requests per second, p50=1.095 msec
SPOP: 4152824.00 requests per second, p50=1.095 msec
ZPOPMIN: 3977724.75 requests per second, p50=1.143 msec
Ok, so I'm still worried about the panic and how it might impact production systems in edge cases related to kqueue failures. I would like to get the performance improvement, but don't want that to come at the cost of potential failures. Would you be happy with this being some type of build flag so that folks have to opt in to it at build time?
I'm OK with that. But I'm not sure how you plan on doing that. The first way I can think of is to add a new configuration in …
Another approach could be Makefile flags, something like …
@madolson Please check out the latest commit.
@panjf2000, do you mind counting the number of `kevent` system calls?
Not only do we register and unregister events for connections and disconnections, we also do that with … As for the statistics of the system calls to `kevent`, see below.
Benchmark command
./valkey-benchmark -r 10000000 -d 1024 -n 10000000 -P 100 -t get,lpop,rpop,spop,zpopmin -q

Branch: valkey-io:unstable

CALL COUNT
…
accept 64
fcntl 174
setsockopt 263
kevent 1227
read 22255
write 22257

GET: 3496503.50 requests per second, p50=1.223 msec
LPOP: 3617945.00 requests per second, p50=1.191 msec
RPOP: 3617945.00 requests per second, p50=1.191 msec
SPOP: 3553660.50 requests per second, p50=1.207 msec
ZPOPMIN: 3673769.50 requests per second, p50=1.175 msec

Branch: panjf2000:kqueue-batch

CALL COUNT
…
accept 67
fcntl 174
setsockopt 263
kevent 903
read 21085
write 21114

GET: 3834355.75 requests per second, p50=1.135 msec
LPOP: 3815337.50 requests per second, p50=1.151 msec
RPOP: 3675119.50 requests per second, p50=1.207 msec
SPOP: 3707823.50 requests per second, p50=1.191 msec
ZPOPMIN: 3732736.25 requests per second, p50=1.159 msec
To make it clearer, by "update the kqueue batch handler", did you mean removing the …?
Or did you mean that we need to fix #528 by changing the return type in the function signatures of …?
Yeah, this one. I think we can start with the …
Great! Then I guess we can continue the code review here? @madolson