Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automatic decompression should sanitize Content-Encoding and Content-Length headers from the response #1729

Open
lucacasonato opened this issue Dec 21, 2023 · 1 comment
Labels
needs implementer interest Moving the issue forward requires implementers to express interest needs tests Moving the issue forward requires someone to write tests topic: http

Comments

@lucacasonato
Copy link
Member

What is the issue with the Fetch Standard?

The fetch() spec allows browsers to perform decompression of HTTP responses in fetch() if an appropriate content-encoding header is set on the response. In this case, the Response.prototype.body stream no longer reflects the raw bytes (modulo protocol framing) received on the wire, but instead a processed version of the bytes after being passed through a decompression routine.

This decompression is meant to be transparent to users: they do not have to explicitly opt in or enable it. Further, they can not even disable this (ref #1524).

Unfortunately, the decompression is currently not very transparent: given an arbitrary response object, it is ambiguous whether the Response's body has been decompressed or is still compressed.

This causes real world problems:

  • it poses a hazards when implementers add new encodings for automatic decompression, because a user that was previously manually decompressing a response with an unsupported content encoding, can now not tell whether they need to perform decompression or not after a browser adds native support for decompressing this content encoding
  • proxies can not tell what headers they need to send downstream (For compressed responses, omit Content-Length and Content-Encoding headers wintercg/fetch#23)

Proposal

I propose we strip out Content-Length (because it represents the content length prior to decompression), and Content-Encoding (because it represents the encoding prior to decompression) from Response headers when we perform automatic response body decompression in fetch(). I am not suggestion this affects responses created with new Response() or responses returned from fetch() that do not have automatic response body decompression performed.

Compatibility

I don't think this change will break any existing code. It may skew some folks' monitoring tools. I make this assumption based on the following thoughts:

  • The Content-Length before decompression is meaningless if you only have the decompressed body. You can not infer how long the real response is based on the Content-Length in both gzip and br.
  • The original Content-Encoding is not useful in combination with a compressed body. The only use I can think of is monitoring usecases where you want to determine what percentage of your assets were served with compression (and with which compression).

Prior art

In the JavaScript space:

  • Both Deno and Cloudflare implement this proposed fix to allow for the proxy use case mentioned above

In other programming languages:

@annevk
Copy link
Member

annevk commented Jan 8, 2024

See also #716 and #1358.

In principle I'm supportive of addressing this. I think the main problem here is #716, but perhaps targeted smaller improvements are possible with sufficient test coverage.

@annevk annevk added needs implementer interest Moving the issue forward requires implementers to express interest needs tests Moving the issue forward requires someone to write tests topic: http labels Jan 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs implementer interest Moving the issue forward requires implementers to express interest needs tests Moving the issue forward requires someone to write tests topic: http
Development

No branches or pull requests

2 participants