New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Black hole detection false-positives #1855
Comments
Design proposal: A loss burst is said to be suspicious (possibly caused by an MTU reduction) if its smallest packet is larger than any more recently acknowledged packet. Packets are deemed lost when at most one packet threshold (typically set to 3) worth of later packets have been acknowledged. We therefore need to store only at most one packet threshold worth of packet sizes, those of the most recently acknowledged packets not preceding a loss burst, to judge that a future loss burst might be caused by an MTU reduction. Past loss bursts are forgotten immediately upon acknowledgement of a larger packet than their smallest. If the number of past loss bursts exceeds a threshold, then we report a black hole. |
Sounds okay. Should we run this by quicdev or something? Edit: ah, I saw you already did. Should have checked first. 😄 |
Some thoughts from that discussion:
Still seems to me like the approach I sketched above is an attractive sweet spot. |
In reviewing #1858, I'm wondering if this holds? If the MTU reduces, why would we only consider losses to be suspicious only if the smallest packet is larger than any more recently acknowledged packet? The MTU could have decreased past whatever was recently acknowledged, right? |
When a packet is acknowledged, we know that the MTU is at least that packet's size. If a loss burst includes packets that are smaller than a packet that was transmitted later and which has been acknowledged, then we know at least some of the packets in that burst were not dropped for MTU reasons, so most likely the entire burst was lost due to congestion or link errors. |
Black hole detection triggers when the number of "suspicious loss bursts" since the last MTU-sized packet exceeds a threshold:
quinn/quinn-proto/src/connection/mtud.rs
Lines 398 to 409 in a35ad73
Due to approximations in packet assembly, we (almost?) never actually produce an exactly MTU-sized packet. As a result, any loss rate of packets larger than the minimum threshold to be suspicious will eventually trigger black hole detection, resulting in a spurious MTU reduction and loss of throughput.
We should relax the conditions for resetting the counter. For example, if we've confirmed delivery of a packet of a certain size, loss bursts containing packets of equal or smaller size shouldn't be considered suspicious any more.
The text was updated successfully, but these errors were encountered: