High latency when messages are not being sent repeatedly #4673
Comments
Hi @plied , Btw, |
None of the above seem to make a big difference, I tried the following:
It seems like the |
I came across a similar problem, which led me to this issue. Could it be that the processes are put to sleep by the CPU scheduler and need to be rescheduled before the messages can be processed? This could affect both the local and remote processes, IMO. One way to validate this hypothesis is to give the process high priority and use a real-time kernel.
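The priority experiment suggested here could be sketched roughly as follows. This is only an illustration under assumptions: it targets Linux, `try_set_realtime_priority` is a hypothetical helper name (not part of libzmq), and `SCHED_FIFO` is just one possible real-time policy. The call normally needs root or `CAP_SYS_NICE`, so the helper reports the error code instead of aborting.

```cpp
#include <cassert>
#include <cerrno>
#include <sched.h>

// Hypothetical helper: asks the kernel to run the calling process under the
// SCHED_FIFO real-time policy at the given priority.
// Returns 0 on success, otherwise the errno value set by sched_setscheduler
// (for example EPERM when the process lacks the required privileges).
int try_set_realtime_priority(int priority) {
    assert(priority > 0 && "SCHED_FIFO priorities start at 1");

    sched_param param{};
    param.sched_priority = priority;

    // A pid of 0 means "the calling process".
    if (sched_setscheduler(0, SCHED_FIFO, &param) != 0) {
        return errno;
    }
    return 0;
}
```

Calling this at the start of both the sender and the receiver, and comparing latencies with and without it, would be one way to check whether rescheduling delay explains the spikes.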
It could be something like that; however, I did achieve very good and consistent performance using Aeron on the same OS and kernel, without tweaking any scheduler configs. My hunch is that something within ZMQ itself is misconfigured, but I can't figure out what.
Issue description
While building an ultra-low-latency application using ZeroMQ over IPC, I noticed that even though the benchmark run with perf/local_lat and perf/remote_lat achieves sub-50us latencies within the same host, these results are not reproducible in production. After hours of research I found that if there is any pause between calls to the zmq_sendmsg method, the next message is sent with a huge latency (>200us).
This means that if the application sends messages back to back it behaves well, but if there are breaks between messages (which is a more realistic use case) the latency of the sent message increases dramatically.
This issue was partially identified in issues #3577 and #3560, but they didn't come up with a reliable way to reproduce it, so it was never solved.
I wonder if this could be caused by the sender thread losing priority if the zmq_sendmsg function is not called for some time?
NOTE that I'm calling the deprecated zmq_sendmsg method only because the original perf/remote_lat also calls it.
I was able to replicate this same issue in the Python client, the Rust client and C++ itself.
Environment
Minimal test code / Steps to reproduce the issue
Modify the perf/remote_lat.cpp file so that it measures the round-trip time of each message independently and adds a delay after each round trip is finished and measured.
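The shape of that measurement can be sketched as below. This is not the modified benchmark from this report, only a minimal illustration under assumptions: `measure_round_trips_us` is a hypothetical helper, and the actual round trip is passed in as a callback so the sketch stays independent of libzmq. Each round trip is timed on its own, and the idle period falls outside the timed region.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper (not part of libzmq or its perf tools): times every
// round trip individually and idles for `gap` after each one.
// Returns one latency sample per round trip, in microseconds.
std::vector<double> measure_round_trips_us(
    const std::function<void()>& round_trip,
    std::size_t count,
    std::chrono::microseconds gap) {
    assert(round_trip && "round_trip callback must be callable");

    using clock = std::chrono::steady_clock;
    std::vector<double> latencies_us;
    latencies_us.reserve(count);

    for (std::size_t i = 0; i < count; ++i) {
        const auto start = clock::now();
        round_trip();  // e.g. send one message and block until the echo returns
        const auto stop = clock::now();

        latencies_us.push_back(
            std::chrono::duration<double, std::micro>(stop - start).count());

        // Idle between messages; this sleep is not included in the sample.
        std::this_thread::sleep_for(gap);
    }
    return latencies_us;
}
```

In the libzmq case the callback would presumably wrap one send followed by one blocking receive on the benchmark socket. Running it once with a gap of zero and once with a gap of a few milliseconds makes the back-to-back and idle cases directly comparable.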
What's the actual result? (include assertion message & call stack if applicable)
We keep getting very high latencies:
What's the expected result?
The original code, which is exactly the same except that it does not sleep between messages, resulted in average latencies below 50us every single time. I would expect latencies to behave similarly whether or not we are sending messages often; otherwise the entire latency benchmark is misleading and useless.