Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Make neqo pass amplificationlimit QNS test #1875

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

larseggert
Copy link
Collaborator

Fixes #1183

Copy link

github-actions bot commented May 3, 2024

Failed Interop Tests

QUIC Interop Runner, client vs. server

All results

Succeeded Interop Tests

QUIC Interop Runner, client vs. server

Unsupported Interop Tests

QUIC Interop Runner, client vs. server

neqo-transport/src/connection/mod.rs Outdated Show resolved Hide resolved
neqo-transport/src/connection/tests/handshake.rs Outdated Show resolved Hide resolved
neqo-transport/src/recovery.rs Outdated Show resolved Hide resolved
qns/interop.sh Show resolved Hide resolved
larseggert and others added 2 commits May 6, 2024 08:23
Co-authored-by: Martin Thomson <mt@lowentropy.net>
Signed-off-by: Lars Eggert <lars@eggert.org>
@larseggert larseggert marked this pull request as ready for review May 6, 2024 09:24
@larseggert
Copy link
Collaborator Author

One test not passing, see #1490 (comment)

@larseggert larseggert linked an issue May 6, 2024 that may be closed by this pull request
Copy link

github-actions bot commented May 7, 2024

Failed Interop Tests

QUIC Interop Runner, client vs. server

All results

Succeeded Interop Tests

QUIC Interop Runner, client vs. server

Unsupported Interop Tests

QUIC Interop Runner, client vs. server

@larseggert
Copy link
Collaborator Author

@martinthomson @KershawChang if any of you understands the timer code, I'd appreciate some help here. The idle_timeout_crazy_rtt persistently crashes with assertion failed: earliest > now, which seems to be a long-standing issue.

It's straightforward to reproduce:

# RUST_LOG=trace cargo nextest run --no-capture idle_timeout_crazy_rtt
...
0s 35ms DEBUG [idle_timeout_crazy_rtt] processing Server c11688b68f20d6a8b4f512e565e90a
0s 35ms TRACE [Server c11688b68f20d6a8b4f512e565e90a] process_output Handshaking Instant { tv_sec: 5648807, tv_nsec: 122490916 }
0s 35ms TRACE IdleTimeout::expiry@Instant { tv_sec: 5648807, tv_nsec: 122490916 } pto=325ms, ka=false => Instant { tv_sec: 610448805, tv_nsec: 622490916 }
0s 35ms TRACE [LossRecovery] timeout Instant { tv_sec: 5648807, tv_nsec: 122490916 }
0s 35ms TRACE detect lost ap: now=Instant { tv_sec: 5648807, tv_nsec: 122490916 } delay=112.5ms
0s 35ms TRACE detect lost hs: now=Instant { tv_sec: 5648807, tv_nsec: 122490916 } delay=112.5ms
0s 35ms TRACE lost=11, time sent Instant { tv_sec: 5648806, tv_nsec: 822490916 } is before lost_delay 112.5ms
0s 35ms DEBUG packet_lost this=0x600002f741e0, pn=11, ps=675
0s 35ms DEBUG [NewReno CongCtrl 662/2674 ssthresh 2674] Cong event -> recovery; cwnd 2674, ssthresh 2674
0s 35ms DEBUG [NewReno CongCtrl 662/2674 ssthresh 2674] state -> RecoveryStart
0s 35ms DEBUG on_packets_lost this=0x600002f741e0, bytes_in_flight=662, cwnd=2674, state=RecoveryStart
0s 35ms TRACE detect lost in: now=Instant { tv_sec: 5648807, tv_nsec: 122490916 } delay=112.5ms
0s 35ms TRACE lost=11, time sent Instant { tv_sec: 5648806, tv_nsec: 822490916 } is before lost_delay 112.5ms
0s 35ms DEBUG packet_lost this=0x600002f741e0, pn=11, ps=662
0s 35ms DEBUG on_packets_lost this=0x600002f741e0, bytes_in_flight=0, cwnd=2674, state=RecoveryStart
0s 35ms DEBUG [Server c11688b68f20d6a8b4f512e565e90a] Lost: Crypto(CryptoRecoveryToken { space: Handshake, offset: 0, length: 628 })
0s 35ms INFO Lost crypto frame space=hs offset=0 length=628
0s 35ms DEBUG [Server c11688b68f20d6a8b4f512e565e90a] Lost: Crypto(CryptoRecoveryToken { space: Initial, offset: 0, length: 90 })
0s 35ms INFO Lost crypto frame space=in offset=0 length=90
0s 35ms TRACE [Server c11688b68f20d6a8b4f512e565e90a] output Instant { tv_sec: 5648807, tv_nsec: 122490916 }
0s 35ms DEBUG [LossRecovery] get send profile Instant { tv_sec: 5648807, tv_nsec: 122490916 }
0s 35ms DEBUG [Server c11688b68f20d6a8b4f512e565e90a] output_path send_profile SendProfile { limit: 255, pto: None, probe: (), paced: false }
0s 35ms DEBUG Building Initial dcid 0900000000c1f2c64a scid 0800000000c32312
0s 35ms DEBUG Building Handshake dcid 0900000000c1f2c64a scid 0800000000c32312
0s 35ms DEBUG Building Short dcid 0900000000c1f2c64a
0s 35ms DEBUG TX blocked, profile=SendProfile { limit: 255, pto: None, probe: (), paced: false }
0s 35ms TRACE [Server c11688b68f20d6a8b4f512e565e90a] Get callback delay Instant { tv_sec: 5648807, tv_nsec: 122490916 }
0s 35ms TRACE ack_time for ap = None
0s 35ms TRACE ack_time for hs = None
0s 35ms TRACE ack_time for in = None
0s 35ms TRACE IdleTimeout::expiry@Instant { tv_sec: 5648807, tv_nsec: 122490916 } pto=325ms, ka=false => Instant { tv_sec: 610448805, tv_nsec: 622490916 }
0s 35ms TRACE [Server c11688b68f20d6a8b4f512e565e90a] Idle/keepalive timer Instant { tv_sec: 610448805, tv_nsec: 622490916 }
0s 35ms TRACE [LossRecovery] next_timeout loss=None pto=Some(Instant { tv_sec: 5648807, tv_nsec: 122490916 })
0s 35ms TRACE [Server c11688b68f20d6a8b4f512e565e90a] Loss recovery timer Instant { tv_sec: 5648807, tv_nsec: 122490916 }
thread 'idle_timeout_crazy_rtt' panicked at neqo-transport/src/connection/mod.rs:1068:9:
assertion failed: earliest > now

@KershawChang
Copy link
Collaborator

KershawChang commented May 7, 2024

@martinthomson @KershawChang if any of you understands the timer code, I'd appreciate some help here. The idle_timeout_crazy_rtt persistently crashes with assertion failed: earliest > now, which seems to be a long-standing issue.

I think we need to implement what Martin suggested to do in this comment.
I'll try to give it a shot.
In order to unblock this, maybe we can remove that debug assertion first?

@larseggert
Copy link
Collaborator Author

larseggert commented May 7, 2024

In order to unblock this, maybe we can remove that debug assertion first?

Then it dies here

thread 'idle_timeout_crazy_rtt' panicked at test-fixture/src/sim/mod.rs:202:25:
assertion `left != right` failed
  left: 0ns
 right: 0ns

And when I take that out, too, it spins forever.

Copy link

github-actions bot commented May 7, 2024

Firefox builds for this PR

The following builds are available for testing. Crossed-out builds did not succeed.

fn maybe_fire_pto(&mut self, primary_path: &PathRef, now: Instant, lost: &mut Vec<SentPacket>) {
let path = primary_path.borrow();
if !path.is_valid() {
return;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@larseggert sorry, I just realized that the root cause of this PR is not the same as #1491 .
That assertion earliest > now is caused by this line. If I remove the early return, the test idle_timeout_crazy_rtt passes.

I am still looking into why this change can trigger that assertion.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@KershawChang any insights?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Mark amplificationlimit QNS test as unsupported Disable PTO at server if path is not validated
3 participants