performance of workers limited by downlink bandwidth #20
Comments
This seems like a reasonable addition to me.
The code is already there in https://github.com/streamcode9/abraxas/commit/fbb7be1e0f075a7257115432bead5597efe1e6a3 (see |
I cleaned up the changes, and put to … Unfortunately, the way it is implemented now shows excellent performance over WAN (I got about 3 ms of overhead per job on batches of 1000 jobs over a bad 200 ms link), but it breaks support for multiple servers. The whole algorithm/protocol looks like this (in pseudocode):
It's rather complicated already, and with speculative GRAB_JOBs sent to multiple servers it will be even worse.
Imagine that there is one heavy worker, i.e. it consumes so many resources that it's not practical to run more than one job on it at a time.
In this case it is still beneficial to grab more than one job to fully utilize the connection. E.g. if one job (job_assign packet) is 10 KB long, then on a 10 Mbit/s connection with 25 ms latency there should be 10 * 1024 * 1024 * 0.025 / (8 * 10 * 1024) = 3.2 packets in flight, which rounds up to 4.
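The arithmetic above is a bandwidth-delay calculation, and it can be sketched as a small helper. This is only an illustration of the formula from this issue, not code from abraxas; the function name `jobs_in_flight` and its parameters are made up for the example.

```python
import math


def jobs_in_flight(link_bits_per_s: float, latency_s: float, job_bytes: int) -> int:
    """Number of job_assign packets needed in flight to keep the link busy.

    Hypothetical helper: divides the amount of data the link can carry
    during one latency period by the size of a single job packet,
    then rounds up to a whole number of packets.
    """
    bits_per_latency_period = link_bits_per_s * latency_s
    job_bits = 8 * job_bytes
    return math.ceil(bits_per_latency_period / job_bits)


# The example from the issue: 10 Mbit/s link, 25 ms latency, 10 KB jobs.
example = jobs_in_flight(10 * 1024 * 1024, 0.025, 10 * 1024)
print(example)  # 3.2 rounded up to 4
```

Larger jobs or a lower latency shrink the result towards 1, which is the case where grabbing extra jobs stops helping.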
My proposal is to have another control for the job count. maxJobs controls how many jobs are executed concurrently, and maxExtraJobsInFlight (or a shorter name) controls, well, the extra jobs in flight. So in the example situation mentioned above, we will have
maxJobs = 1; maxExtraJobsInFlight = 4
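A minimal sketch of how the two limits could interact, assuming the worker counts running jobs, jobs already received but not started, and grab requests that have not been answered yet. None of this is the actual abraxas implementation; `JobLimits`, `grabs_to_send` and `can_start` are hypothetical names, and treating the extra jobs as being on top of maxJobs is one possible reading of the proposal.

```python
from dataclasses import dataclass


@dataclass
class JobLimits:
    max_jobs: int                  # jobs executed concurrently (maxJobs)
    max_extra_jobs_in_flight: int  # extra jobs prefetched (maxExtraJobsInFlight)


def grabs_to_send(limits: JobLimits, running: int, queued: int, pending_grabs: int) -> int:
    """How many more grab requests the worker may send right now.

    Assumption: every running job, every received-but-not-started job and
    every unanswered grab request counts against the combined limit.
    """
    capacity = limits.max_jobs + limits.max_extra_jobs_in_flight
    committed = running + queued + pending_grabs
    return max(0, capacity - committed)


def can_start(limits: JobLimits, running: int, queued: int) -> bool:
    """A prefetched job may start only while fewer than max_jobs are running."""
    return queued > 0 and running < limits.max_jobs


# The example from the issue: one heavy job at a time, four extra in flight.
limits = JobLimits(max_jobs=1, max_extra_jobs_in_flight=4)
print(grabs_to_send(limits, running=0, queued=0, pending_grabs=0))
print(can_start(limits, running=1, queued=3))
```

Keeping the two counters separate means the heavy worker never runs two jobs at once, while the link is still kept busy with prefetched assignments.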