
fix #557: read proxy response until \r\n\r\n #620

Merged: 1 commit, Sep 30, 2023

Conversation

@puffi (Contributor) commented May 4, 2023

Headers are terminated by CRLF according to the spec [0]. Between headers and the body there's an additional CRLF. This means we have to parse until we get 2 CRLFs right after each other to know when we can stop reading from the stream.

To do that we read byte by byte and try matching the last 4 bytes when we receive an LF ('\n'). That seems similar to the way curl does it [1].

The response parsing doesn't have to be changed, since only the status code is of interest, and it always comes first [0].

[0] https://www.ietf.org/rfc/rfc2068.txt (6 Response)
[1] https://github.com/curl/curl/blob/89f6fafedbb57b62fddf58223cfa92307bf51a68/lib/cf-h1-proxy.c#L476
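A minimal sketch of the loop described above (names are hypothetical, not the exact ureq code): read one byte at a time, and whenever an LF arrives, check whether the buffer now ends in `\r\n\r\n`.

```rust
use std::io::{self, Read};

// Sketch of the approach from the PR description: read byte by byte
// and stop once the last four bytes are "\r\n\r\n", i.e. the CRLF that
// ends the final header plus the blank line before the body.
fn read_proxy_response(stream: &mut impl Read) -> io::Result<Vec<u8>> {
    let mut proxy_response = Vec::new();
    let mut buf = [0u8; 1];
    loop {
        let n = stream.read(&mut buf)?; // `?` propagates IO errors
        if n == 0 {
            break; // EOF before the header terminator
        }
        proxy_response.push(buf[0]);
        // Only run the suffix comparison when we just saw an LF.
        if buf[0] == b'\n' && proxy_response.ends_with(b"\r\n\r\n") {
            break;
        }
    }
    Ok(proxy_response)
}
```

Note that anything the proxy sends after the blank line stays unread in the stream, which is what we want: those bytes belong to the tunneled connection.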

@algesten (Owner) left a comment:

Hi @puffi

Thanks for this PR! I've made some comments.

src/stream.rs Outdated
proxy_response.append(&mut buf);
if total < 256 {
let mut buf = [0u8; 1];
while let Ok(total) = stream.read(&mut buf) {
@algesten (Owner):

The `?` on `read` in the previous version propagated the IO error out. This version stops the loop on `Err` and silently discards it, which is probably not what we want. I suggest keeping the loop + `read` variant.

src/stream.rs Outdated
proxy_response.push(buf[0]);
match buf[0] {
b'\n' => {
if proxy_response.ends_with(b"\r\n\r\n".as_slice()) {
@algesten (Owner):

`ends_with` is a fairly expensive call, but it's mitigated by the fact that we only do it when we encounter an `\n`, so on balance it's probably ok.

I don't think `.as_slice()` does anything for you here.

@puffi (Contributor, Author):

According to the docs for `Vec` [0] and the source [1], this is a simple byte comparison of a subslice, so it shouldn't be that expensive.

[0] https://doc.rust-lang.org/std/vec/struct.Vec.html#method.ends_with
[1] https://doc.rust-lang.org/src/core/slice/mod.rs.html#2303-2305
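For reference, a suffix check like this amounts to a length check plus one subslice comparison. A minimal sketch of that logic (not the actual std implementation, just an illustration of its cost):

```rust
// Sketch of what a byte-slice suffix check boils down to: compare the
// needle against the equally long tail of the haystack.
fn ends_with_sketch(haystack: &[u8], needle: &[u8]) -> bool {
    haystack.len() >= needle.len()
        && &haystack[haystack.len() - needle.len()..] == needle
}
```

With a four-byte needle like `\r\n\r\n`, each call compares at most four bytes, which supports the point that the check is cheap.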

src/stream.rs Outdated
break;
}
proxy_response.push(buf[0]);
match buf[0] {
@algesten (Owner):

I think this match can be replaced with `if buf[0] == b'\n' { … }`.

@puffi (Contributor, Author) commented May 5, 2023

Thank you for the review!

Seeing the CI runs failing, I remembered that I forgot to run clippy as well.

I've force-pushed the changes.

@algesten (Owner) left a comment:

Looks good! Just thought of one more thing.

src/stream.rs Outdated
break;
}

proxy_response.push(buf[0]);
@algesten (Owner):

While this code is a big improvement on the previous version (which looks broken), there's one problem here.

The chance is perhaps very slim, but a misbehaving proxy could keep writing anything but `\r\n\r\n` and mount a denial-of-service attack: the ever-growing `proxy_response` would eventually exhaust the available RAM.

In the "normal" header parsing this is capped by `MAX_HEADER_COUNT` and `MAX_HEADER_SIZE`. We need to do something here as well. Ideally it would use the same logic as the main header parsing (maybe we can break that out into a separate subfunction and reuse it in both call sites?).

Or we need to cap `proxy_response` at something like 1 MB.
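The simpler of the two suggestions (a fixed cap) could look roughly like this. The limit, function, and error handling here are illustrative assumptions, not ureq's actual code:

```rust
use std::io::{self, Read};

// Hypothetical cap from the review suggestion: abort once the proxy
// response exceeds a fixed limit so a misbehaving proxy cannot grow
// the buffer without bound.
const MAX_PROXY_RESPONSE: usize = 1024 * 1024; // 1 MB

fn read_proxy_response_capped(stream: &mut impl Read) -> io::Result<Vec<u8>> {
    let mut proxy_response = Vec::new();
    let mut buf = [0u8; 1];
    loop {
        if proxy_response.len() >= MAX_PROXY_RESPONSE {
            // Treat an over-long response as malformed rather than
            // allocating indefinitely.
            return Err(io::Error::new(
                io::ErrorKind::InvalidData,
                "proxy response exceeded 1 MB",
            ));
        }
        if stream.read(&mut buf)? == 0 {
            break; // EOF
        }
        proxy_response.push(buf[0]);
        if buf[0] == b'\n' && proxy_response.ends_with(b"\r\n\r\n") {
            break;
        }
    }
    Ok(proxy_response)
}
```

The PR ultimately went a different way (reusing `Response` for parsing, see below), which inherits the existing header limits instead of introducing a separate cap.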

@puffi (Contributor, Author):

Ok, will look into that.

@puffi (Contributor, Author) commented May 5, 2023:

I've looked into it, and we could actually just use the `Response` struct for parsing the response. The easiest way would be to pass a `Stream` to `Response::do_from_stream`, though that requires cloning the `TcpStream`. I'm not sure if I'm missing something regarding `Stream` and its handling of the underlying `TcpStream`, especially with regard to dropping.

The response can then be used to check the status code instead of parsing the string again in `verify_response`.

Would that be acceptable to you, or would you prefer a different solution?

Edit:
The whole code in stream.rs would be replaced by this:

// build Stream from TcpStream
let response = Response::do_from_stream(stream, ...)?;
Proxy::verify_response(&response)?;

And verify_response would be replaced with this:

match response.status() {
    200 => Ok(()),
    401 | 407 => Err(ErrorKind::ProxyUnauthorized.new()),
    _ => Err(ErrorKind::ProxyConnect.new()),
}

Commit message:

Headers are terminated by CRLF according to the spec [0]. Between headers and the body there's an additional CRLF. This means we have to parse until we get two consecutive CRLFs to know when we can stop reading from the stream.

Since the `Response` struct already handles this, we can reuse that
instead of custom stream reading and response parsing.

[0] https://www.ietf.org/rfc/rfc2068.txt (6 Response)

Signed-off-by: Mira Limbeck <m.limbeck@proxmox.com>
@algesten merged commit a07f4f4 into algesten:main on Sep 30, 2023

@algesten (Owner): Thanks!

2 participants