
Consistent Memory Increase in Webflux Application #3154

Closed
aspOEDev opened this issue Apr 14, 2024 · 5 comments
Comments

aspOEDev commented Apr 14, 2024

I am relatively new to the Reactor framework. I have built a new BFF-layer service for our application that integrates with 7 different downstream systems using WebFlux, but we are observing a gradual increase in the pods' memory consumption. When there are timeouts or downstream failures, memory starts spiking and does not come back to normal until the pod is restarted.

Below are the versions we have used:

  1. Java - 17.0.2
  2. spring-boot-starter-webflux - 2.7.15
  3. spring-webflux - 5.3.29
  4. spring-cloud-starter-gateway - 3.1.4

Below is how I have initialized the WebClient in a generic client service.

import java.net.URI;
import java.time.Duration;
import java.util.function.Function;

import org.springframework.http.HttpHeaders;
import org.springframework.http.MediaType;
import org.springframework.web.reactive.function.client.WebClient;
import org.springframework.web.reactive.function.client.WebClient.ResponseSpec;
import org.springframework.web.util.UriBuilder;

import reactor.netty.http.client.HttpClientRequest;

// Methods of the generic client service; `log` and `metricsWebClientCustomizer`
// (a MetricsWebClientCustomizer) are fields of the enclosing class.

public ResponseSpec get(String url, HttpHeaders headers, int timeOutinMillis,
                        Function<UriBuilder, URI> uriFunction) {
    log.info("Building get request for {}, headers {}", url, headers);
    // A new builder and WebClient are created for every request.
    WebClient.Builder webClientBuilder = WebClient.builder();
    metricsWebClientCustomizer.customize(webClientBuilder);

    WebClient client = webClientBuilder.baseUrl(url).build();
    return client.get().uri(uriFunction).headers(h -> h.addAll(headers)).httpRequest(httpRequest -> {
        // Per-request response timeout set on the underlying Reactor Netty request.
        HttpClientRequest reactorRequest = httpRequest.getNativeRequest();
        reactorRequest.responseTimeout(Duration.ofMillis(timeOutinMillis));
    }).retrieve();
}

public ResponseSpec post(String url, HttpHeaders headers, Object body, int timeOutinMillis,
                         Function<UriBuilder, URI> uriFunction) {
    log.info("Building post request for {}, headers {}, body {}", url, headers, body);
    // Again, a new builder and WebClient per request.
    WebClient.Builder webClientBuilder = WebClient.builder();
    metricsWebClientCustomizer.customize(webClientBuilder);

    WebClient client = webClientBuilder.baseUrl(url).build();
    return client.post().uri(uriFunction).headers(h -> h.addAll(headers)).contentType(MediaType.APPLICATION_JSON)
            .bodyValue(body)
            .httpRequest(httpRequest -> {
                HttpClientRequest reactorRequest = httpRequest.getNativeRequest();
                reactorRequest.responseTimeout(Duration.ofMillis(timeOutinMillis));
            }).retrieve();
}

Initially I was using the autowired WebClient.Builder instance to initialise the client, but as load increased I observed calls for the same downstream client going to the wrong APIs, with requests getting mixed up. So I switched to creating a new builder instance with WebClient.builder() for every call, as suggested in some blogs, and that solved the wrong-call issue. After heap dump analysis we also reduced logging to lower memory consumption, but that has only made the service take longer to crash.
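
For context, the wrong-call symptom with a shared autowired builder typically comes from mutating that shared builder (for example, calling baseUrl(...) on the same instance from concurrent requests). Below is a minimal sketch of an alternative that keeps the Boot-configured builder but never mutates it; the class and method names are illustrative, not taken from the issue:

import org.springframework.stereotype.Service;
import org.springframework.web.reactive.function.client.WebClient;

// Hypothetical illustration: copy the injected builder per call instead of mutating
// the shared instance, so concurrent requests cannot pick up each other's base URLs
// while customizations applied to the builder (metrics, etc.) are preserved.
@Service
public class PerCallClientFactory {

    private final WebClient.Builder sharedBuilder;

    public PerCallClientFactory(WebClient.Builder sharedBuilder) {
        this.sharedBuilder = sharedBuilder;
    }

    public WebClient clientFor(String baseUrl) {
        // clone() copies the builder's state; configuring the copy does not
        // affect other callers of the shared builder.
        return sharedBuilder.clone().baseUrl(baseUrl).build();
    }
}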

This is how the memory trend looks during service startup; the curve then becomes relatively flat, but there is always a gradual increase until the service crashes:
[screenshot: memory usage trend]

We are running this on Tomcat in Kubernetes.

Below are the current heap dump dominator tree screenshots:
[screenshots: heap dump dominator tree]

The primary suspect per the heap dump analysis is the following class:

[screenshot: MAT leak suspect report]
One instance of org.apache.tomcat.util.collections.SynchronizedStack loaded by org.springframework.boot.loader.LaunchedURLClassLoader @ 0xad2b9908 occupies 20,715,496 (20.88%) bytes. The memory is accumulated in one instance of java.lang.Object[], loaded by <system class loader>, which occupies 20,715,464 (20.88%) bytes.

Thread java.lang.Thread @ 0xaf1017a0 http-nio-8074-Acceptor has a local variable or reference to org.apache.tomcat.util.net.NioEndpoint @ 0xad8c4318 which is on the shortest path to java.lang.Object[500] @ 0xb1cb1e38. The thread java.lang.Thread @ 0xaf1017a0 http-nio-8074-Acceptor keeps local variables with total size 408 (0.00%) bytes.

Stack trace of the suspect thread in the heap dump:

http-nio-8074-Acceptor
  at sun.nio.ch.Net.accept(Ljava/io/FileDescriptor;Ljava/io/FileDescriptor;[Ljava/net/InetSocketAddress;)I (Net.java(Native Method))
  at sun.nio.ch.ServerSocketChannelImpl.implAccept(Ljava/io/FileDescriptor;Ljava/io/FileDescriptor;[Ljava/net/SocketAddress;)I (ServerSocketChannelImpl.java:425)
  at sun.nio.ch.ServerSocketChannelImpl.accept()Ljava/nio/channels/SocketChannel; (ServerSocketChannelImpl.java:391)
  at org.apache.tomcat.util.net.NioEndpoint.serverSocketAccept()Ljava/nio/channels/SocketChannel; (NioEndpoint.java:548)
  at org.apache.tomcat.util.net.NioEndpoint.serverSocketAccept()Ljava/lang/Object; (NioEndpoint.java:79)
  at org.apache.tomcat.util.net.Acceptor.run()V (Acceptor.java:129)
  at java.lang.Thread.run()V (Thread.java:833)

I tried reproducing this on a local setup but have not been able to. Any suggestions or guidance on where I can improve the application's performance would be really helpful.

@violetagg violetagg self-assigned this Apr 15, 2024
violetagg (Member) commented:

@aspOEDev

> Initially I was using the autowired WebClient.Builder instance to initialise the client, but as load increased I observed calls for the same downstream client going to the wrong APIs, with requests getting mixed up. So I switched to creating a new builder instance with WebClient.builder() for every call, as suggested in some blogs, and that solved the wrong-call issue. After heap dump analysis we also reduced logging to lower memory consumption, but that has only made the service take longer to crash.

Please clarify whether you create the WebClient for every request?

If yes then please check this: https://stackoverflow.com/questions/77715508/httpclient-recomendations

violetagg (Member) commented:

@aspOEDev The mentioned versions are quite old, please upgrade to the latest supported versions.

@violetagg violetagg added the for/user-attention (This issue needs user attention) label Apr 15, 2024
aspOEDev (Author) commented:

@violetagg thanks for the suggestions.

> @aspOEDev
>
> Initially I was using the autowired WebClient.Builder instance to initialise the client, but as load increased I observed calls for the same downstream client going to the wrong APIs, with requests getting mixed up. So I switched to creating a new builder instance with WebClient.builder() for every call, as suggested in some blogs, and that solved the wrong-call issue. After heap dump analysis we also reduced logging to lower memory consumption, but that has only made the service take longer to crash.
>
> Please clarify whether you create the WebClient for every request?
>
> If yes then please check this: https://stackoverflow.com/questions/77715508/httpclient-recomendations

Yes, I am creating a new WebClient and builder instance per request. In our initial iteration we used a common autowired builder instance, which resulted in APIs and request content getting mixed up and wrong calls being fired, so we went with the safest approach of creating a new instance per request. I understand we can cache a client instance per host and reduce the memory footprint to some extent (see the sketch after this comment). I will also upgrade the versions and validate.

Let me give it a try and get back to you.
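
A minimal sketch of the per-host caching idea mentioned above, assuming a single injected WebClient.Builder; the class and method names are illustrative rather than taken from the issue:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.springframework.stereotype.Service;
import org.springframework.web.reactive.function.client.WebClient;

// Hypothetical sketch: build one WebClient per downstream base URL and reuse it,
// instead of creating a new builder and client on every request, so the underlying
// connection pool and event-loop resources are shared across calls.
@Service
public class CachedWebClientProvider {

    private final WebClient.Builder builder;
    private final Map<String, WebClient> clientsByBaseUrl = new ConcurrentHashMap<>();

    public CachedWebClientProvider(WebClient.Builder builder) {
        this.builder = builder;
    }

    public WebClient forBaseUrl(String baseUrl) {
        // computeIfAbsent builds the client once per host; later calls reuse it.
        // clone() keeps the shared builder itself unmodified.
        return clientsByBaseUrl.computeIfAbsent(baseUrl,
                url -> builder.clone().baseUrl(url).build());
    }
}

Per-request response timeouts can still be set on each call via .httpRequest(...) exactly as in the original snippet, so caching the client does not remove that flexibility.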

github-actions bot commented:

If you would like us to look at this issue, please provide the requested information. If the information is not provided within the next 7 days this issue will be closed.


github-actions bot commented May 1, 2024

Closing due to lack of requested feedback. If you would like us to look at this issue, please provide the requested information and we will re-open.

@github-actions github-actions bot closed this as not planned May 1, 2024
@violetagg violetagg added the status/invalid (We don't feel this issue is valid) label and removed the for/user-attention and status/need-feedback labels May 1, 2024