
fix #557: read proxy response until \r\n\r\n #620

Merged: 1 commit, Sep 30, 2023

Conversation

@puffi (Contributor) commented May 4, 2023

Headers are terminated by CRLF according to the spec [0]. Between headers and the body there's an additional CRLF. This means we have to parse until we get 2 CRLFs right after each other to know when we can stop reading from the stream.

To do that we read byte by byte and try matching the last 4 bytes when we receive an LF ('\n'). That seems similar to the way curl does it [1].

The response parsing doesn't have to be changed, since only the status code is of interest, and it always comes first [0].

[0] https://www.ietf.org/rfc/rfc2068.txt (6 Response)
[1] https://github.com/curl/curl/blob/89f6fafedbb57b62fddf58223cfa92307bf51a68/lib/cf-h1-proxy.c#L476
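A minimal sketch of the loop described above (names are hypothetical, not the exact ureq code): read one byte at a time, and whenever an LF arrives, check whether the buffer now ends in `\r\n\r\n`.

```rust
use std::io::{self, Read};

// Sketch of the approach from the PR description: read byte by byte
// and stop once the last four bytes are "\r\n\r\n", i.e. the CRLF that
// ends the final header plus the blank line before the body.
fn read_proxy_response(stream: &mut impl Read) -> io::Result<Vec<u8>> {
    let mut proxy_response = Vec::new();
    let mut buf = [0u8; 1];
    loop {
        let n = stream.read(&mut buf)?; // `?` propagates IO errors
        if n == 0 {
            break; // EOF before the header terminator
        }
        proxy_response.push(buf[0]);
        // Only run the suffix comparison when we just saw an LF.
        if buf[0] == b'\n' && proxy_response.ends_with(b"\r\n\r\n") {
            break;
        }
    }
    Ok(proxy_response)
}
```

Note that anything the proxy sends after the blank line stays unread in the stream, which is what we want: those bytes belong to the tunneled connection.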

@algesten (Owner) left a comment:

Hi @puffi

Thanks for this PR! I've made some comments.

src/stream.rs Outdated
proxy_response.append(&mut buf);
if total < 256 {
let mut buf = [0u8; 1];
while let Ok(total) = stream.read(&mut buf) {
@algesten (Owner):

The `?` on `read` in the previous version propagated the IO error out. This version stops the loop on `Err` and silently discards it, which is probably not what we want. I suggest keeping the loop + `read` variant.

src/stream.rs Outdated
proxy_response.push(buf[0]);
match buf[0] {
b'\n' => {
if proxy_response.ends_with(b"\r\n\r\n".as_slice()) {
@algesten (Owner):

`ends_with` is a fairly expensive call, but it's mitigated by the fact that we only do it when we encounter an `\n`, so on balance it's probably ok.

I don't think `.as_slice()` does anything for you here.

@puffi (Contributor, Author):

According to the docs for `Vec` [0] and the source [1], this is a simple byte comparison of a subslice, so it shouldn't be that expensive.

[0] https://doc.rust-lang.org/std/vec/struct.Vec.html#method.ends_with
[1] https://doc.rust-lang.org/src/core/slice/mod.rs.html#2303-2305
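For reference, a suffix check like this amounts to a length check plus one subslice comparison. A minimal sketch of that logic (not the actual std implementation, just an illustration of its cost):

```rust
// Sketch of what a byte-slice suffix check boils down to: compare the
// needle against the equally long tail of the haystack.
fn ends_with_sketch(haystack: &[u8], needle: &[u8]) -> bool {
    haystack.len() >= needle.len()
        && &haystack[haystack.len() - needle.len()..] == needle
}
```

With a four-byte needle like `\r\n\r\n`, each call compares at most four bytes, which supports the point that the check is cheap.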

src/stream.rs Outdated
break;
}
proxy_response.push(buf[0]);
match buf[0] {
@algesten (Owner):

I think this match can be replaced with `if buf[0] == b'\n' { … }`.

@puffi (Contributor, Author) commented May 5, 2023

Thank you for the review!

Seeing the CI runs failing, I remembered that I forgot to run clippy as well.

I've force-pushed the changes.

@algesten (Owner) left a comment:

Looks good! Just thought of one more thing.

src/stream.rs Outdated
break;
}

proxy_response.push(buf[0]);
@algesten (Owner):

While this code is a big improvement on the previous version (which looks broken), there's one problem here.

The chance is perhaps very slim, but a misbehaving proxy could keep writing anything but `\r\n\r\n` and mount a denial-of-service attack: the ever-growing `proxy_response` would eventually exhaust the available RAM.

In the "normal" header parsing this is capped by `MAX_HEADER_COUNT` and `MAX_HEADER_SIZE`. We need to do something here as well. Ideally it would use the same logic as the main header parsing (maybe we can break that out into a separate subfunction and reuse it in both call sites?).

Or we need to cap `proxy_response` at something like 1 MB.
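The simpler of the two suggestions (a fixed cap) could look roughly like this. The limit, function, and error handling here are illustrative assumptions, not ureq's actual code:

```rust
use std::io::{self, Read};

// Hypothetical cap from the review suggestion: abort once the proxy
// response exceeds a fixed limit so a misbehaving proxy cannot grow
// the buffer without bound.
const MAX_PROXY_RESPONSE: usize = 1024 * 1024; // 1 MB

fn read_proxy_response_capped(stream: &mut impl Read) -> io::Result<Vec<u8>> {
    let mut proxy_response = Vec::new();
    let mut buf = [0u8; 1];
    loop {
        if proxy_response.len() >= MAX_PROXY_RESPONSE {
            // Treat an over-long response as malformed rather than
            // allocating indefinitely.
            return Err(io::Error::new(
                io::ErrorKind::InvalidData,
                "proxy response exceeded 1 MB",
            ));
        }
        if stream.read(&mut buf)? == 0 {
            break; // EOF
        }
        proxy_response.push(buf[0]);
        if buf[0] == b'\n' && proxy_response.ends_with(b"\r\n\r\n") {
            break;
        }
    }
    Ok(proxy_response)
}
```

The PR ultimately went a different way (reusing `Response` for parsing, see below), which inherits the existing header limits instead of introducing a separate cap.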

@puffi (Contributor, Author):

Ok, will look into that.

@puffi (Contributor, Author) commented May 5, 2023:

I've looked into it, and we could actually just use the `Response` struct for parsing the response. The easiest way would be to pass a `Stream` to `Response::do_from_stream`, though that requires cloning the `TcpStream`. I'm not sure if I'm missing something regarding `Stream` and its handling of the underlying `TcpStream`, especially with regard to dropping.

The response can then be used to check the status code instead of parsing the string again in `verify_response`.

Would that be acceptable to you, or would you prefer a different solution?

Edit:
The whole code in stream.rs would be replaced by this:

// build Stream from TcpStream
let response = Response::do_from_stream(stream, ...)?;
Proxy::verify_response(&response)?;

And verify_response would be replaced with this:

match response.status() {
    200 => Ok(()),
    401 | 407 => Err(ErrorKind::ProxyUnauthorized.new()),
    _ => Err(ErrorKind::ProxyConnect.new()),
}

Commit message:

Headers are terminated by CRLF according to the spec [0]. Between headers and the body there's an additional CRLF. This means we have to parse until we get two consecutive CRLFs to know when we can stop reading from the stream.

Since the `Response` struct already handles this, we can reuse that
instead of custom stream reading and response parsing.

[0] https://www.ietf.org/rfc/rfc2068.txt (6 Response)

Signed-off-by: Mira Limbeck <m.limbeck@proxmox.com>
@algesten merged commit a07f4f4 into algesten:main on Sep 30, 2023

@algesten (Owner): Thanks!

2 participants