Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Response decompression in Apache Http Client 5 #1112

Open
aleksey-petrov-github opened this issue Aug 16, 2021 · 4 comments
Open

Response decompression in Apache Http Client 5 #1112

aleksey-petrov-github opened this issue Aug 16, 2021 · 4 comments

Comments

@aleksey-petrov-github
Copy link

aleksey-petrov-github commented Aug 16, 2021

There is known issue (#135) : The Logbook HTTP Client integration is handling gzip-compressed response entities incorrectly if the interceptor runs before a decompressing interceptor. Since logging compressed contents is not really helpful it's advised to register the logbook interceptor as the last interceptor in the chain.

This advice is not suitable for Apache Http Client 5, because ResponseContentEncoding interceptor now is implemented as ContentCompressionExec handler which is called after interceptors.
So the logged content is compressed and also body filters are broken because they test compressed content.

For my project workaround is to implement strategy, that wraps response with decompressing logic:

package a.b.c;

import org.apache.hc.client5.http.entity.DeflateInputStreamFactory;
import org.apache.hc.client5.http.entity.GZIPInputStreamFactory;
import org.apache.hc.client5.http.entity.InputStreamFactory;
import org.zalando.logbook.HttpRequest;
import org.zalando.logbook.HttpResponse;
import org.zalando.logbook.Strategy;

import java.io.IOException;
import java.util.Map;

public class ResponseDecompressingStrategy implements Strategy {

    private static final Map<String, InputStreamFactory> DECODER_REGISTRY = Map.of(
            "gzip", GZIPInputStreamFactory.getInstance(),
            "x-gzip", GZIPInputStreamFactory.getInstance(),
            "deflate", DeflateInputStreamFactory.getInstance());

    @Override
    public HttpResponse process(HttpRequest request, HttpResponse response) throws IOException {
        return new DecompressingHttpResponse(response.withBody(), DECODER_REGISTRY);
    }
}
package a.b.c;

import lombok.AllArgsConstructor;
import lombok.NonNull;
import org.apache.hc.client5.http.entity.InputStreamFactory;
import org.apache.hc.core5.http.HttpHeaders;
import org.zalando.logbook.ForwardingHttpResponse;
import org.zalando.logbook.HttpResponse;

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.Collections;
import java.util.Map;
import java.util.Optional;

@AllArgsConstructor
public class DecompressingHttpResponse implements ForwardingHttpResponse {

    @NonNull
    private final HttpResponse response;

    @NonNull
    private final Map<String, InputStreamFactory> decoderRegistry;

    @Override
    public HttpResponse delegate() {
        return response;
    }

    @Override
    public byte[] getBody() throws IOException {
        return getDecompressedBody();
    }

    @Override
    public String getBodyAsString() throws IOException {
        return new String(getDecompressedBody(), getCharset());
    }

    private byte[] getDecompressedBody() throws IOException {
        byte[] bytes = response.getBody();

        if (bytes.length <= 0) {
            return bytes;
        }

        Optional<InputStreamFactory> inputStreamFactory = Optional.ofNullable(response.getHeaders().get(HttpHeaders.CONTENT_ENCODING))
                .orElseGet(Collections::emptyList)
                .stream()
                .filter(decoderRegistry::containsKey)
                .map(decoderRegistry::get)
                .findFirst();

        if (inputStreamFactory.isEmpty()) {
            return bytes;
        }

        return inputStreamFactory.get().create(new ByteArrayInputStream(bytes)).readAllBytes();
    }
}

It might be not suitable for any reason for others. And also it is not optimal because there is no information about response that helps to define that decompression is needed inside strategy - no access to body length and content encoding header at this step, so decision is made in DecompressingHttpResponse. But strategy is the only place to wrap before FilteredHttpResponse.

In my opinion there are 2 ways to handle decompression:

  1. Reimplement interceptor in logbook-httpclient5 as ExecChainHandler and add it last to exec chain, but seems it is not suitable place for logging, but affects only logbook-httpclient5 module.
  2. Allow Logbook creation with arbitrary decoder registry, which will be applied before response filters in DefaultLogbook and will wrap response with DecompressingHttpResponse passing appropriate decoder (concrete entry value of registry map). But it affects logbook-api and logbook-core modules.

Inline old ResponseContentEncoding interceptor and explicitly include it in common chain for Http Client 5 (like it done for OkHttp* #386) seems not proper way also.

Interesting to know your opinion first before any pull requests.

@whiskeysierra
Copy link
Collaborator

@aleksey-petrov-github Thank you very much for your detailed investigation! 📣

I'd be in favor of option 1, just because it wouldn't affect api and core modules.

@SimplicialCycle
Copy link
Contributor

@whiskeysierra
I hope you like the fix.

@github-actions
Copy link
Contributor

In order to prioritize the support for Logbook, we would like to check whether the old issues are still relevant.
This issue has not been updated for over six months.

  • Please check if it is still relevant in latest version of the Logbook.
  • If so, please add a descriptive comment to keep the issue open.
  • Otherwise, the issue will automatically be closed after a week.

@github-actions github-actions bot added the stale label Jul 17, 2023
@mikhail-putilov
Copy link

mikhail-putilov commented Jul 26, 2023

Hi 🙂

I am currently using the latest Logbook and HttpClient5. I noticed that the fix from above works well but has one critical
undocumented downside:
An additional exec chain .addExecInterceptorFirst("Logbook", new LogbookHttpExecHandler(logbook)) indeed fixes the problem with compression but it does not log anything that is mutated by .addRequestInterceptorFirst(<any interceptor>).

This is important downside that must be at least documented in README, in my opinion. The usual way of building an HTTP client instance is to use HttpClientBuilder the following way:

HttpClients.custom()
                .setDefaultHeaders(List.of(new BasicHeader("name", "value")))
                .setUserAgent("asd")
                .setConnectionManager(...)
...

And in this case, neither user agent, nor default headers are logged by Logbook because of exactly that. setDefaultHeaders and setUserAgent invocations add addRequestInterceptorFirst to PROTOCOL chain inside of HttpClient.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants