feat: DPLPMTUD #1903
Conversation
As far as I can tell the trait function […] Does that make sense @larseggert?
Benchmark results

Performance differences relative to b900860.

- coalesce_acked_from_zero 1+1 entries: No change in performance detected.
  time: [196.26 ns 196.72 ns 197.21 ns]
  change: [-0.4267% -0.0721% +0.2810%] (p = 0.70 > 0.05)
  Found 12 outliers among 100 measurements (12.00%): 5 (5.00%) high mild, 7 (7.00%) high severe
- coalesce_acked_from_zero 3+1 entries: Change within noise threshold.
  time: [237.02 ns 237.73 ns 238.43 ns]
  change: [-1.6460% -1.2165% -0.8542%] (p = 0.00 < 0.05)
  Found 19 outliers among 100 measurements (19.00%): 15 (15.00%) high mild, 4 (4.00%) high severe
- coalesce_acked_from_zero 10+1 entries: No change in performance detected.
  time: [236.56 ns 236.90 ns 237.37 ns]
  change: [-1.1836% -0.4907% +0.3412%] (p = 0.26 > 0.05)
  Found 3 outliers among 100 measurements (3.00%): 3 (3.00%) high severe
- coalesce_acked_from_zero 1000+1 entries: No change in performance detected.
  time: [218.37 ns 218.57 ns 218.79 ns]
  change: [-6.2263% -2.4119% -0.1027%] (p = 0.18 > 0.05)
  Found 14 outliers among 100 measurements (14.00%): 3 (3.00%) low mild, 3 (3.00%) high mild, 8 (8.00%) high severe
- RxStreamOrderer::inbound_frame(): Change within noise threshold.
  time: [119.04 ms 119.16 ms 119.28 ms]
  change: [+0.1833% +0.3087% +0.4264%] (p = 0.00 < 0.05)
- transfer/Run multiple transfers with the same seed: 💚 Performance has improved.
  time: [48.990 ms 49.276 ms 49.442 ms]
  thrpt: [80.903 MiB/s 81.175 MiB/s 81.650 MiB/s]
  change: time: [-58.866% -58.606% -58.429%] (p = 0.00 < 0.05), thrpt: [+140.55% +141.58% +143.11%]
  Found 3 outliers among 100 measurements (3.00%): 1 (1.00%) low severe, 1 (1.00%) low mild, 1 (1.00%) high mild
- 1-conn/1-100mb-resp (aka. Download)/client: 💚 Performance has improved.
  time: [160.89 ms 169.02 ms 176.88 ms]
  thrpt: [565.37 MiB/s 591.64 MiB/s 621.56 MiB/s]
  change: time: [-84.839% -83.199% -80.866%] (p = 0.00 < 0.05), thrpt: [+422.62% +495.22% +559.59%]
  Found 2 outliers among 10 measurements (20.00%): 1 (10.00%) high mild, 1 (10.00%) high severe
- 1-conn/10_000-parallel-1b-resp (aka. RPS)/client: No change in performance detected.
  time: [390.15 ms 393.63 ms 397.11 ms]
  thrpt: [25.182 Kelem/s 25.405 Kelem/s 25.631 Kelem/s]
  change: time: [-2.3395% -1.1675% +0.0092%] (p = 0.06 > 0.05), thrpt: [-0.0092% +1.1813% +2.3956%]
  Found 1 outlier among 100 measurements (1.00%): 1 (1.00%) low mild
- 1-conn/1-1b-resp (aka. HPS)/client: No change in performance detected.
  time: [41.889 ms 42.076 ms 42.255 ms]
  thrpt: [23.666 elem/s 23.766 elem/s 23.873 elem/s]
  change: time: [-0.1532% +0.3684% +0.8589%] (p = 0.16 > 0.05), thrpt: [-0.8516% -0.3670% +0.1535%]
  Found 28 outliers among 100 measurements (28.00%): 11 (11.00%) low severe, 3 (3.00%) low mild, 14 (14.00%) high severe

Client/server transfer results: Transfer of 134217728 bytes over loopback.
I'm not seeing PMTUD tests, which would be necessary for this.
The big question I have is the one that Christian raises about PMTUD generally: how do you know that the bytes you spend on PMTUD pay you back?
There is probably a case for sending probes when you have spare sending capacity and nothing better to send. Indeed, successfully probing will let us push congestion windows up more and could even improve performance.
What I'm seeing here displaces other data. I'd like to see something that doesn't do that. There's a fundamental problem that needs analysis though. You can't predict that a connection will be used for uploads, so you don't know when probes will really help. I see a few cases:
- The connection is short-lived or low volume. Probes are strictly wasteful.
- The connection is long-lived and high volume, with ample idle time for probing. Probes can use gaps. This might be a video stream, where probing can fit into a warmup period. Probes are therefore easy and super-helpful.
- The connection exists only to support a smaller upload. The upload is small enough that probes are wasteful.
- The connection exists only to support a larger upload. The upload is large enough that spending bytes on probing early on is a good investment.
Cases 1 and 2 are easy to deal with. We could probe on an idle connection and tolerate a small amount of waste for case 1 if it makes case 2 appreciably better.
The split between 3 and 4 is rough. There is an uncertain zone between the two as well where some probing is justified, but successive rounds of probing might be wasteful as the throughput gain over the remaining time diminishes relative to the wasted effort of extra probes.
Right now, you don't send real data in probes. You are effectively betting on the probes being lost. But you could send data, which would reduce the harm in case 3. It might even make the code slightly simpler.
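A toy sketch of that last idea, under the assumption that queued stream data can simply be drained into the probe; none of these names are neqo's actual API, they are illustrative only:

```rust
// Toy sketch: fill a probe-sized packet with real stream data first,
// then pad up to the probe size. If the probe is lost, normal loss
// recovery retransmits the data, so only the padding is pure waste.
// All types and names here are hypothetical, not neqo's API.
struct Probe {
    size: usize,      // target datagram size being probed
    payload: Vec<u8>, // stream data followed by padding
}

fn build_probe(pending: &mut Vec<u8>, probe_size: usize) -> Probe {
    let take = pending.len().min(probe_size);
    let mut payload: Vec<u8> = pending.drain(..take).collect();
    payload.resize(probe_size, 0); // pad the remainder, like PADDING frames
    Probe { size: probe_size, payload }
}

fn main() {
    let mut pending = vec![42u8; 1200]; // pretend application data
    let probe = build_probe(&mut pending, 1480);
    assert_eq!(probe.payload.len(), probe.size);
    assert!(pending.is_empty()); // the queued data rode along in the probe
}
```

Filling probes with data this way is what would reduce the harm in case 3: a lost probe then costs one retransmission rather than a full packet of padding.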
Co-authored-by: Martin Thomson <mt@lowentropy.net> Signed-off-by: Lars Eggert <lars@eggert.org>
now: Instant,
) {
    // Track lost probes
    let lost = self.count_pmtud_probes(lost_packets);
Is there a scenario where lost can be larger than 1? In other words, is there a scenario where more than one probe is in flight at once?

If I understand correctly, prepare_probe is only called in Probe::Needed, and prepare_probe switches to Probe::Prepared. The state only switches back to Probe::Needed when the in-flight probe has been lost or acked. Thus there is never more than one probe in flight.

If the above is correct, why is there a count_pmtud_probes function? Am I missing something?
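For concreteness, here is one reading of the state machine described above as a toy reconstruction; the variant and method names follow the comments, everything else is an assumption rather than the PR's code:

```rust
// Toy reconstruction of the probe state machine described above.
#[derive(Debug, PartialEq, Eq)]
enum Probe {
    Needed,   // a new probe should be prepared
    Prepared, // a probe has been prepared and is (about to be) in flight
}

struct Pmtud {
    probe: Probe,
}

impl Pmtud {
    // prepare_probe is only called while in Probe::Needed and moves the
    // state to Probe::Prepared, so a second probe cannot be prepared
    // until the first one resolves.
    fn prepare_probe(&mut self) {
        debug_assert_eq!(self.probe, Probe::Needed);
        self.probe = Probe::Prepared;
    }

    // Only an ack or a loss of the in-flight probe returns the state to
    // Probe::Needed. Under this reading, at most one probe is in flight
    // at any time, so a count of lost probes should never exceed 1.
    fn probe_acked_or_lost(&mut self) {
        self.probe = Probe::Needed;
    }
}

fn main() {
    let mut p = Pmtud { probe: Probe::Needed };
    p.prepare_probe();
    p.probe_acked_or_lost();
    assert_eq!(p.probe, Probe::Needed);
}
```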
I tried to implement the algorithm @martinthomson outlined. There may well be optimizations.
Co-authored-by: Max Inden <mail@max-inden.de> Signed-off-by: Lars Eggert <lars@eggert.org>
Making this a draft until the test failure is fixed.
This implements a simplified variant of DPLPMTUD (aka RFC 8899), which by default probes for increasingly larger PMTUs using an MTU table.
Known limitations:
- There is currently no attempt to repeat the PMTUD at intervals.
- There is no attempt to detect PMTUs that fall between values in the table.
- There is no attempt to handle the case where the PMTU shrinks.

A lot of the existing tests (~50%) break when PMTUD is enabled, so this PR disables it by default. New tests that cover PMTUD were added to this PR.
Fixes #243
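To make the table-driven approach concrete, here is a minimal sketch of upward probing as the description outlines it. The table values, field names, and methods are illustrative assumptions for the sketch, not the PR's actual constants or API:

```rust
/// Illustrative table of candidate PMTUs, smallest first. These values
/// are assumptions: 1280 is the QUIC minimum datagram size, the rest
/// are common link MTUs.
const MTU_TABLE: &[usize] = &[1280, 1480, 1500, 2048, 4096, 8192];

struct Pmtud {
    confirmed: usize,       // index of the largest MTU known to work
    probing: Option<usize>, // index currently being probed, if any
}

impl Pmtud {
    fn new() -> Self {
        Self { confirmed: 0, probing: None }
    }

    /// Start probing the next larger candidate, if the table has one.
    fn start_probe(&mut self) -> Option<usize> {
        let next = self.confirmed + 1;
        let size = MTU_TABLE.get(next).copied()?;
        self.probing = Some(next);
        Some(size)
    }

    /// Probe acked: that size works, so move up and probe again.
    fn on_probe_acked(&mut self) {
        if let Some(i) = self.probing.take() {
            self.confirmed = i;
        }
    }

    /// Probe lost: settle on the last confirmed size. (Per the
    /// limitations above, there is no later retry, no search between
    /// table entries, and no handling of a shrinking PMTU.)
    fn on_probe_lost(&mut self) {
        self.probing = None;
    }
}

fn main() {
    let mut pmtud = Pmtud::new();
    while let Some(size) = pmtud.start_probe() {
        // Pretend probes up to 1500 bytes succeed on this path.
        if size <= 1500 {
            pmtud.on_probe_acked();
        } else {
            pmtud.on_probe_lost();
            break;
        }
    }
    assert_eq!(MTU_TABLE[pmtud.confirmed], 1500);
}
```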