Support UTF8 in request headers #981

Open
Keksoj opened this issue Aug 16, 2023 · 5 comments

Comments

@Keksoj
Member

Keksoj commented Aug 16, 2023

For now, Sōzu follows RFC 7230 for the characters allowed in HTTP headers. This forbids UTF-8 characters, so Sōzu returns a 400 when a header contains them.

It could be beneficial to perform a simple passthrough, since more and more traffic contains UTF-8 in headers (typically in non-English-speaking countries).

This may have to be done in Kawa.

@Keksoj Keksoj added this to the v0.16.0 milestone Aug 16, 2023
@Keksoj Keksoj added this to Backlog in Roadmap via automation Aug 16, 2023
@Geal
Member

Geal commented Aug 16, 2023

This is bound to cause security issues if Sōzu and the backend server do not follow the same specification for header parsing. RFC 2616 allowed header values in ISO-8859-1, and that was already causing issues.

@Wonshtrum
Member

This is one of my concerns, but I admit I can't find a scenario that causes security issues. If you can provide examples (for UTF-8 or ISO-8859-1), I would greatly appreciate it.
Header passthrough seems pretty easy and tempting to implement. As the delimiters are single-byte US-ASCII characters (: and \r), they can't be part of a multi-byte UTF-8 sequence, so parsing the following pattern should be resilient:

(?P<key>[^:]+): (?P<value>[^\r]+)\r\n

Unfortunately, this will parse valid as well as malformed UTF-8, so we will not be able to use from_utf8_unchecked anymore, but I can't see another downside.
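
A minimal sketch of the idea, assuming the parser works on raw bytes: split on the ASCII delimiters only, then validate the value with the checked `std::str::from_utf8` instead of `from_utf8_unchecked`. The helpers `split_header_line` and `value_as_str` are hypothetical names for illustration, not Kawa's actual API, and the split only roughly mirrors the pattern above.

```rust
/// Hypothetical helper, not Kawa's parser: splits one header line on the
/// single-byte ASCII delimiters and leaves the value as opaque bytes.
fn split_header_line(line: &[u8]) -> Option<(&[u8], &[u8])> {
    // The line must end with CRLF.
    let body = line.strip_suffix(b"\r\n")?;
    // The key ends at the first ':' byte. Bytes of a multi-byte UTF-8
    // sequence all have their high bit set, so they can never equal ':'.
    let colon = body.iter().position(|&b| b == b':')?;
    let (key, rest) = body.split_at(colon);
    // Skip the ':' and require the single space used in the pattern.
    let value = rest[1..].strip_prefix(b" ")?;
    // Roughly mirror `[^:]+` and `[^\r]+`: non-empty key and value, no stray '\r'.
    if key.is_empty() || value.is_empty() || value.contains(&b'\r') {
        return None;
    }
    Some((key, value))
}

/// Checked conversion: returns None for malformed UTF-8, which is why
/// `from_utf8_unchecked` is no longer safe once arbitrary bytes pass through.
fn value_as_str(value: &[u8]) -> Option<&str> {
    std::str::from_utf8(value).ok()
}
```

With these helpers, a line such as `cf-region: Île-de-France\r\n` splits cleanly and its value validates, while a value containing the bytes `0xC3 0x28` still splits but fails the UTF-8 check.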

As a side note, while Cloudflare documents the header charset restriction, we have already had issues with some of its services adding UTF-8 headers (for example, cf-region: Île-de-France).

@Wonshtrum Wonshtrum self-assigned this Aug 22, 2023
@Keksoj
Member Author

Keksoj commented Nov 15, 2023

The issue of special characters in headers is addressed in this PR on Kawa, if I'm not mistaken. Should we close the issue, @Wonshtrum?

@Geal
Member

Geal commented Nov 15, 2023

> This is one of my concerns but I admit I can't find a scenario that causes security issues

Look up HTTP request or header smuggling; there are a lot of fun variants. The basic idea is that you have a proxy and a web server behind it that do not interpret requests in the same way. An additional request can be sent to the server that the proxy would not log, a header might be seen with a different value, etc.

Here's one example:
https://twitter.com/BitK_/status/1351587043814604805
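
To illustrate "a header might be seen with a different value", here is a small sketch (an illustration, not taken from the linked example) of the same header bytes decoded under the two charsets discussed in this thread. The function names are hypothetical.

```rust
/// Decode bytes as ISO-8859-1: each byte maps to the code point of the same value.
fn decode_latin1(bytes: &[u8]) -> String {
    bytes.iter().map(|&b| b as char).collect()
}

/// Decode bytes as UTF-8, replacing malformed sequences with U+FFFD.
fn decode_utf8_lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}
```

If a proxy reads a value with one decoder and the backend with the other, the bytes of `Île-de-France` yield two different strings, so any decision based on the decoded value (routing, filtering, logging) can diverge between the two.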

@Wonshtrum
Member

@Keksoj, this PR in Kawa only adds support for ISO-8859-1, in order to reintroduce the tolerant-http1-parsing feature, which has not been working since the introduction of Kawa.
@Geal, thanks for providing this case, I will look into it. It seems we will stay on ISO-8859-1.
