Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash in [@ neqo_transport::connection::Connection::build_packet_header] #1429

Open
KershawChang opened this issue Apr 26, 2023 · 1 comment

Comments

@KershawChang
Copy link
Collaborator

See https://bugzilla.mozilla.org/show_bug.cgi?id=1829989.

The crash reason is called Option::unwrap()on aNone value.
Crash stack:

8 libxul.so core::panicking::panic library/core/src/panicking.rs:114 cfi
9 libxul.so neqo_transport::connection::Connection::build_packet_header third_party/rust/neqo-transport/src/packet/mod.rs:0 cfi
10 libxul.so neqo_transport::connection::Connection::output_close third_party/rust/neqo-transport/src/connection/mod.rs:1850 inlined
10 libxul.so neqo_transport::connection::Connection::output third_party/rust/neqo-transport/src/connection/mod.rs:1760 inlined
10 libxul.so neqo_transport::connection::Connection::process_output third_party/rust/neqo-transport/src/connection/mod.rs:1004 cfi
11 libxul.so neqo_http3::connection_client::Http3Client::process_output third_party/rust/neqo-http3/src/connection_client.rs:850

It's not clear to me where unwrap is called inside function build_packet_header. Maybe @martinthomson knows?

@martinthomson
Copy link
Member

The likely source of the unwrap call is in Path::remote_cid or Path::local_cid. If we don't have a connection ID for the path, we could be in a position where this panics. Those functions will likely be inlined.

The remote connection ID is set when a path becomes permanent. Temporary paths is a concept we use for incoming packets on new paths. We should not be sending CONNECTION_CLOSE on a temporary path, so this is not likely to be the cause of the problem.

The local connection ID is set only for handshake packets. That carries a risk. If we are in a position where we don't have a local connection ID on the path, but we have to send CONNECTION_CLOSE in a long header, we might hit this path. I would look at calls to ensure_error_path to see if we are creating a path without a local connection ID from a short-header packet.

A simple test you might write to evaluate that is one where you haven't completed the handshake, but then a packet containing a garbage frame comes in on a different path. That packet would have a short header (or not, check both). If that crashes, then my theory might be solid.

That's what I'd do to understand the problem, which is always the first thing to try. We might get a better and more comprehensive fix if we do that. A simple patch to prevent the panic might be to check in output_close, whether the path has a local connection ID, before trying to send a close for any PacketNumberSpace that is not PacketNumberSpace::ApplicationData.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants