feat: DPLPMTUD #1903

Draft · larseggert wants to merge 57 commits into main
Conversation

larseggert (Collaborator) commented on May 14, 2024

This implements a simplified variant of DPLPMTUD (RFC 8899), which by default probes for increasingly larger PMTUs using a table of common MTU values.

There is currently no attempt to repeat PMTUD at intervals, to detect PMTUs that lie between values in the table, or to handle the case where the PMTU shrinks.
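For illustration, here is a minimal sketch of the table-driven upward search described above. The table values and the names SEARCH_TABLE and next_probe_size are hypothetical, not neqo's actual constants or API:

```rust
// Hypothetical sketch: probe upward through a table of common MTU values,
// raising the confirmed PMTU on each acked probe and stopping on loss.
const SEARCH_TABLE: &[usize] = &[1280, 1380, 1420, 1472, 1500, 2047, 4095, 8191];

/// The next size to probe: the smallest table entry larger than the
/// currently confirmed PMTU, or None when the table is exhausted.
fn next_probe_size(confirmed_pmtu: usize) -> Option<usize> {
    SEARCH_TABLE.iter().copied().find(|&s| s > confirmed_pmtu)
}

fn main() {
    let mut pmtu = 1280; // QUIC's minimum datagram size
    while let Some(probe) = next_probe_size(pmtu) {
        // Assume probes up to 1500 bytes are acked on this path.
        if probe <= 1500 {
            pmtu = probe; // probe acked: raise the confirmed PMTU
        } else {
            break; // probe lost: stop searching upward
        }
    }
    assert_eq!(pmtu, 1500);
}
```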

About half of the existing tests break when PMTUD is enabled, so this PR disables it by default. New tests covering PMTUD were added in this PR.

Fixes #243

larseggert changed the title from feat: Groudwork for DPLPMTUD to feat: Groundwork for DPLPMTUD on May 14, 2024
github-actions bot commented on May 14, 2024

Failed Interop Tests

QUIC Interop Runner, client vs. server

Succeeded Interop Tests

QUIC Interop Runner, client vs. server

Unsupported Interop Tests

QUIC Interop Runner, client vs. server

mxinden (Collaborator) commented on May 14, 2024

(There are also a bunch of warnings about unused code that is actually used. I don't understand why that is, since those functions mirror existing ones such as cwnd_avail.)

As far as I can tell, the trait function CongestionControl::cwnd_min and its implementation <ClassicCongestionControl<T> as CongestionControl>::cwnd_min are only called in PacketSender::cwnd_min, which is itself only called from test code. Thus, cargo complains about all three being unused.

Does that make sense @larseggert?
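For reference, a minimal standalone sketch of how this kind of warning arises; PacketSender and cwnd_min here are stand-ins, not neqo's actual definitions:

```rust
// A crate-private function whose only caller lives behind #[cfg(test)]
// looks unused to a normal `cargo build`, because the test module is
// compiled out of non-test builds.
pub(crate) struct PacketSender;

impl PacketSender {
    // Only referenced from the test module below, so non-test builds
    // emit a dead_code warning for this function.
    pub(crate) fn cwnd_min(&self) -> usize {
        1500
    }
}

#[cfg(test)]
mod tests {
    use super::PacketSender;

    #[test]
    fn cwnd_min_is_positive() {
        assert!(PacketSender.cwnd_min() > 0);
    }
}
```

Until a non-test caller exists, the usual options are gating such helpers with #[cfg(test)] or marking them #[allow(dead_code)].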

github-actions bot commented on May 14, 2024

Firefox builds for this PR

The following builds are available for testing. Crossed-out builds did not succeed.

Resolved review threads (outdated): neqo-transport/src/pmtud.rs, neqo-transport/src/path.rs
github-actions bot commented on May 14, 2024

Benchmark results

Performance differences relative to b900860.

coalesce_acked_from_zero 1+1 entries: No change in performance detected.
       time:   [196.26 ns 196.72 ns 197.21 ns]
       change: [-0.4267% -0.0721% +0.2810%] (p = 0.70 > 0.05)
Found 12 outliers among 100 measurements (12.00%)
  5 (5.00%) high mild
  7 (7.00%) high severe
coalesce_acked_from_zero 3+1 entries: Change within noise threshold.
       time:   [237.02 ns 237.73 ns 238.43 ns]
       change: [-1.6460% -1.2165% -0.8542%] (p = 0.00 < 0.05)
Found 19 outliers among 100 measurements (19.00%)
  15 (15.00%) high mild
  4 (4.00%) high severe
coalesce_acked_from_zero 10+1 entries: No change in performance detected.
       time:   [236.56 ns 236.90 ns 237.37 ns]
       change: [-1.1836% -0.4907% +0.3412%] (p = 0.26 > 0.05)
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high severe
coalesce_acked_from_zero 1000+1 entries: No change in performance detected.
       time:   [218.37 ns 218.57 ns 218.79 ns]
       change: [-6.2263% -2.4119% -0.1027%] (p = 0.18 > 0.05)
Found 14 outliers among 100 measurements (14.00%)
  3 (3.00%) low mild
  3 (3.00%) high mild
  8 (8.00%) high severe
RxStreamOrderer::inbound_frame(): Change within noise threshold.
       time:   [119.04 ms 119.16 ms 119.28 ms]
       change: [+0.1833% +0.3087% +0.4264%] (p = 0.00 < 0.05)

transfer/Run multiple transfers with varying seeds: 💚 Performance has improved.
       time:   [11.673 ms 11.972 ms 12.287 ms]
       thrpt:  [325.54 MiB/s 334.11 MiB/s 342.68 MiB/s]
change:
       time:   [-90.172% -89.920% -89.677%] (p = 0.00 < 0.05)
       thrpt:  [+868.74% +892.07% +917.51%]
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

transfer/Run multiple transfers with the same seed: 💚 Performance has improved.
       time:   [48.990 ms 49.276 ms 49.442 ms]
       thrpt:  [80.903 MiB/s 81.175 MiB/s 81.650 MiB/s]
change:
       time:   [-58.866% -58.606% -58.429%] (p = 0.00 < 0.05)
       thrpt:  [+140.55% +141.58% +143.11%]
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) low severe
  1 (1.00%) low mild
  1 (1.00%) high mild
1-conn/1-100mb-resp (aka. Download)/client: 💚 Performance has improved.
       time:   [160.89 ms 169.02 ms 176.88 ms]
       thrpt:  [565.37 MiB/s 591.64 MiB/s 621.56 MiB/s]
change:
       time:   [-84.839% -83.199% -80.866%] (p = 0.00 < 0.05)
       thrpt:  [+422.62% +495.22% +559.59%]
Found 2 outliers among 10 measurements (20.00%)
  1 (10.00%) high mild
  1 (10.00%) high severe
1-conn/10_000-parallel-1b-resp (aka. RPS)/client: No change in performance detected.
       time:   [390.15 ms 393.63 ms 397.11 ms]
       thrpt:  [25.182 Kelem/s 25.405 Kelem/s 25.631 Kelem/s]
change:
       time:   [-2.3395% -1.1675% +0.0092%] (p = 0.06 > 0.05)
       thrpt:  [-0.0092% +1.1813% +2.3956%]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) low mild
1-conn/1-1b-resp (aka. HPS)/client: No change in performance detected.
       time:   [41.889 ms 42.076 ms 42.255 ms]
       thrpt:  [23.666  elem/s 23.766  elem/s 23.873  elem/s]
change:
       time:   [-0.1532% +0.3684% +0.8589%] (p = 0.16 > 0.05)
       thrpt:  [-0.8516% -0.3670% +0.1535%]
Found 28 outliers among 100 measurements (28.00%)
  11 (11.00%) low severe
  3 (3.00%) low mild
  14 (14.00%) high severe

Client/server transfer results

Transfer of 134217728 bytes over loopback.

| Client | Server | CC    | Pacing | Mean [ms]     | Min [ms] | Max [ms] | Relative |
|--------|--------|-------|--------|---------------|----------|----------|----------|
| msquic | msquic |       |        | 395.7 ± 32.0  | 357.3    | 463.3    | 1.00     |
| neqo   | msquic | reno  | on     | 810.7 ± 30.7  | 771.8    | 871.2    | 1.00     |
| neqo   | msquic | reno  |        | 786.4 ± 17.5  | 767.5    | 815.9    | 1.00     |
| neqo   | msquic | cubic | on     | 802.4 ± 27.9  | 766.8    | 843.2    | 1.00     |
| neqo   | msquic | cubic |        | 805.6 ± 63.3  | 762.3    | 966.2    | 1.00     |
| msquic | neqo   | reno  | on     | 448.2 ± 32.2  | 414.6    | 530.4    | 1.00     |
| msquic | neqo   | reno  |        | 504.8 ± 214.6 | 393.1    | 1103.0   | 1.00     |
| msquic | neqo   | cubic | on     | 494.5 ± 39.7  | 454.1    | 566.9    | 1.00     |
| msquic | neqo   | cubic |        | 451.1 ± 43.6  | 378.7    | 500.1    | 1.00     |
| neqo   | neqo   | reno  | on     | 708.0 ± 269.5 | 571.7    | 1434.5   | 1.00     |
| neqo   | neqo   | reno  |        | 578.4 ± 25.3  | 532.0    | 624.3    | 1.00     |
| neqo   | neqo   | cubic | on     | 607.7 ± 43.0  | 525.5    | 667.6    | 1.00     |
| neqo   | neqo   | cubic |        | 597.6 ± 56.4  | 518.4    | 705.3    | 1.00     |


larseggert marked this pull request as ready for review on May 15, 2024 16:29
larseggert changed the title from feat: Groundwork for DPLPMTUD to feat: DPLPMTUD on May 21, 2024
martinthomson (Member) left a comment

I'm not seeing PMTUD tests, which would be necessary for this.

The big question I have is the one that Christian raises about PMTUD generally: how do you know that the bytes you spend on PMTUD pay you back?

There is probably a case for sending probes when you have spare sending capacity and nothing better to send. Indeed, successfully probing will let us push congestion windows up more and could even improve performance.

What I'm seeing here displaces other data. I'd like to see something that doesn't do that. There's a fundamental problem that needs analysis though. You can't predict that a connection will be used for uploads, so you don't know when probes will really help. I see a few cases:

  1. The connection is short-lived or low volume. Probes are strictly wasteful.
  2. The connection is long-lived and high volume, with ample idle time for probing. Probes can use gaps. This might be a video stream, where probing can fit into a warmup period. Probes are therefore easy and super-helpful.
  3. The connection exists only to support a smaller upload. The upload is small enough that probes are wasteful.
  4. The connection exists only to support a larger upload. The upload is large enough that spending bytes on probing early on is a good investment.

Cases 1 and 2 are easy to deal with. We could probe on an idle connection and tolerate a small amount of waste in case 1 if it makes case 2 appreciably better.

The split between cases 3 and 4 is rough. There is also an uncertain zone between the two where some probing is justified, but successive rounds of probing might be wasteful, as the throughput gain over the remaining transfer time diminishes relative to the cost of the extra probes.

Right now, you don't send real data in probes. You are effectively betting on the probes being lost. But you could send data, which would reduce the harm in case 3. It might even make the code slightly simpler.
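As a rough sketch of that alternative, a probe could be filled with pending stream data first and padded only up to the target size; all names here (ProbeBuilder, fill) are illustrative, not this PR's API:

```rust
// Hypothetical sketch: a lost probe then costs only the padding, and a
// successful probe has done useful work carrying real data.
struct ProbeBuilder {
    probe_size: usize, // target datagram size for this probe
    payload: Vec<u8>,
}

impl ProbeBuilder {
    fn new(probe_size: usize) -> Self {
        Self {
            probe_size,
            payload: Vec::with_capacity(probe_size),
        }
    }

    // Take as much pending stream data as fits, then pad to the probe
    // size (the zero bytes stand in for QUIC PADDING frames).
    fn fill(&mut self, pending: &mut Vec<u8>) {
        let take = pending.len().min(self.probe_size);
        self.payload.extend(pending.drain(..take));
        self.payload.resize(self.probe_size, 0);
    }
}

fn main() {
    let mut pending = vec![1u8; 900]; // 900 bytes of queued stream data
    let mut probe = ProbeBuilder::new(1500);
    probe.fill(&mut pending);
    assert_eq!(probe.payload.len(), 1500); // 900 data + 600 padding
    assert!(pending.is_empty());
}
```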

Resolved review threads (outdated): neqo-transport/src/pmtud.rs (6), neqo-transport/src/pace.rs, neqo-transport/src/path.rs (3)
Co-authored-by: Martin Thomson <mt@lowentropy.net>
Signed-off-by: Lars Eggert <lars@eggert.org>
Review threads: neqo-transport/src/cc/classic_cc.rs (resolved), neqo-transport/src/pmtud.rs (4, outdated, resolved)
```rust
    now: Instant,
) {
    // Track lost probes
    let lost = self.count_pmtud_probes(lost_packets);
```
mxinden (Collaborator) commented on May 28, 2024
Is there a scenario where lost can be larger than 1? In other words, is there a scenario where more than one probe is in-flight at once?

If I understand correctly, prepare_probe is only called in state Probe::Needed and switches the state to Probe::Prepared. The state only switches back to Probe::Needed when the in-flight probe has been lost or acked. Thus there is never more than one probe in flight.

If the above is correct, why is there a count_pmtud_probes function? Am I missing something?
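For reference, a minimal sketch of the state machine as described in this question; Probe::Needed and Probe::Prepared are taken from the comment above, while the Sent variant and the method names are illustrative:

```rust
// At most one probe is ever outstanding, because only the Needed state
// can start a new probe, and Sent only returns to Needed on an ack or loss.
#[derive(Debug, PartialEq)]
enum Probe {
    Needed,   // no probe outstanding; a new one may be prepared
    Prepared, // probe built, waiting to be sent
    Sent,     // probe in flight; nothing new until it is acked or lost
}

impl Probe {
    fn prepare(&mut self) {
        if *self == Probe::Needed {
            *self = Probe::Prepared;
        }
    }

    fn send(&mut self) {
        if *self == Probe::Prepared {
            *self = Probe::Sent;
        }
    }

    fn ack_or_loss(&mut self) {
        if *self == Probe::Sent {
            *self = Probe::Needed;
        }
    }
}

fn main() {
    let mut p = Probe::Needed;
    p.prepare();
    p.send();
    p.prepare(); // no-op: a probe is already in flight
    assert_eq!(p, Probe::Sent);
    p.ack_or_loss();
    assert_eq!(p, Probe::Needed);
}
```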

larseggert (Collaborator, Author) replied:
I tried to implement the algorithm @martinthomson outlined. There may well be room for optimization.

larseggert and others added 11 commits May 28, 2024 16:34
Co-authored-by: Max Inden <mail@max-inden.de>
Signed-off-by: Lars Eggert <lars@eggert.org>
Co-authored-by: Max Inden <mail@max-inden.de>
Signed-off-by: Lars Eggert <lars@eggert.org>
Co-authored-by: Max Inden <mail@max-inden.de>
Signed-off-by: Lars Eggert <lars@eggert.org>
Co-authored-by: Max Inden <mail@max-inden.de>
Signed-off-by: Lars Eggert <lars@eggert.org>
larseggert (Collaborator, Author) commented:
Making this a draft until the test failure is fixed.

larseggert marked this pull request as draft on May 31, 2024 12:50
Labels: none yet
Projects: none yet
Development: successfully merging this pull request may close the issue "PMTU discovery"
3 participants