
Consistent Memory Increase in Webflux Application #3154

Closed
aspOEDev opened this issue Apr 14, 2024 · 5 comments
Comments

aspOEDev commented Apr 14, 2024

I am relatively new to the Reactor framework. I have built a new BFF-layer service for our application that integrates with 7 different downstream systems using WebFlux, but we are observing a gradual increase in the pods' memory consumption. When there are timeouts or downstream failures, memory starts spiking and does not come back to normal until the pod is restarted.

Below are the versions we have used:

  1. Java - 17.0.2
  2. spring-boot-starter-webflux - 2.7.15
  3. spring-webflux - 5.3.29
  4. spring-cloud-starter-gateway - 3.1.4

Below is how I have initialized the WebClient in a generic client service.

import java.net.URI;
import java.time.Duration;
import java.util.function.Function;

import org.springframework.http.HttpHeaders;
import org.springframework.http.MediaType;
import org.springframework.web.reactive.function.client.WebClient;
import org.springframework.web.reactive.function.client.WebClient.ResponseSpec;
import org.springframework.web.util.UriBuilder;

import reactor.netty.http.client.HttpClientRequest;

// Methods of the generic client service; `log` and `metricsWebClientCustomizer`
// (a MetricsWebClientCustomizer) are fields of the enclosing class.

public ResponseSpec get(String url, HttpHeaders headers, int timeOutinMillis,
                        Function<UriBuilder, URI> uriFunction) {
    log.info("Building get request for {}, headers {}", url, headers);
    // A new builder and WebClient are created for every request.
    WebClient.Builder webClientBuilder = WebClient.builder();
    metricsWebClientCustomizer.customize(webClientBuilder);

    WebClient client = webClientBuilder.baseUrl(url).build();
    return client.get().uri(uriFunction).headers(h -> h.addAll(headers)).httpRequest(httpRequest -> {
        // Per-request response timeout set on the underlying Reactor Netty request.
        HttpClientRequest reactorRequest = httpRequest.getNativeRequest();
        reactorRequest.responseTimeout(Duration.ofMillis(timeOutinMillis));
    }).retrieve();
}

public ResponseSpec post(String url, HttpHeaders headers, Object body, int timeOutinMillis,
                         Function<UriBuilder, URI> uriFunction) {
    log.info("Building post request for {}, headers {}, body {}", url, headers, body);
    // Again, a new builder and WebClient per request.
    WebClient.Builder webClientBuilder = WebClient.builder();
    metricsWebClientCustomizer.customize(webClientBuilder);

    WebClient client = webClientBuilder.baseUrl(url).build();
    return client.post().uri(uriFunction).headers(h -> h.addAll(headers)).contentType(MediaType.APPLICATION_JSON)
            .bodyValue(body)
            .httpRequest(httpRequest -> {
                HttpClientRequest reactorRequest = httpRequest.getNativeRequest();
                reactorRequest.responseTimeout(Duration.ofMillis(timeOutinMillis));
            }).retrieve();
}

Initially I was using the autowired WebClient.Builder instance to initialise the client, but as load increased I observed calls for the same downstream client going to the wrong APIs, with requests getting mixed up. So I switched to creating a new builder instance with WebClient.builder() for every call, as suggested in some blogs, and that solved the wrong-call issue. After heap dump analysis we also reduced logging to lower memory consumption, but that has only made the service take longer to crash.
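
For context, the wrong-call symptom with a shared autowired builder typically comes from mutating that shared builder (for example, calling baseUrl(...) on the same instance from concurrent requests). Below is a minimal sketch of an alternative that keeps the Boot-configured builder but never mutates it; the class and method names are illustrative, not taken from the issue:

import org.springframework.stereotype.Service;
import org.springframework.web.reactive.function.client.WebClient;

// Hypothetical illustration: copy the injected builder per call instead of mutating
// the shared instance, so concurrent requests cannot pick up each other's base URLs
// while customizations applied to the builder (metrics, etc.) are preserved.
@Service
public class PerCallClientFactory {

    private final WebClient.Builder sharedBuilder;

    public PerCallClientFactory(WebClient.Builder sharedBuilder) {
        this.sharedBuilder = sharedBuilder;
    }

    public WebClient clientFor(String baseUrl) {
        // clone() copies the builder's state; configuring the copy does not
        // affect other callers of the shared builder.
        return sharedBuilder.clone().baseUrl(baseUrl).build();
    }
}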

This is how the memory trend looks during service startup; the curve then becomes relatively flat, but there is always a gradual increase until the service crashes:
[screenshot: memory usage trend]

We are running this on Tomcat in Kubernetes.

Below are the current heap dump dominator tree screenshots:
[screenshots: heap dump dominator tree]

The primary suspect per the heap dump analysis is the following class:

[screenshot: MAT leak suspect report]
One instance of org.apache.tomcat.util.collections.SynchronizedStack loaded by org.springframework.boot.loader.LaunchedURLClassLoader @ 0xad2b9908 occupies 20,715,496 (20.88%) bytes. The memory is accumulated in one instance of java.lang.Object[], loaded by <system class loader>, which occupies 20,715,464 (20.88%) bytes.

Thread java.lang.Thread @ 0xaf1017a0 http-nio-8074-Acceptor has a local variable or reference to org.apache.tomcat.util.net.NioEndpoint @ 0xad8c4318 which is on the shortest path to java.lang.Object[500] @ 0xb1cb1e38. The thread java.lang.Thread @ 0xaf1017a0 http-nio-8074-Acceptor keeps local variables with total size 408 (0.00%) bytes.

Stack trace of the suspect thread in the heap dump:

http-nio-8074-Acceptor
  at sun.nio.ch.Net.accept(Ljava/io/FileDescriptor;Ljava/io/FileDescriptor;[Ljava/net/InetSocketAddress;)I (Net.java(Native Method))
  at sun.nio.ch.ServerSocketChannelImpl.implAccept(Ljava/io/FileDescriptor;Ljava/io/FileDescriptor;[Ljava/net/SocketAddress;)I (ServerSocketChannelImpl.java:425)
  at sun.nio.ch.ServerSocketChannelImpl.accept()Ljava/nio/channels/SocketChannel; (ServerSocketChannelImpl.java:391)
  at org.apache.tomcat.util.net.NioEndpoint.serverSocketAccept()Ljava/nio/channels/SocketChannel; (NioEndpoint.java:548)
  at org.apache.tomcat.util.net.NioEndpoint.serverSocketAccept()Ljava/lang/Object; (NioEndpoint.java:79)
  at org.apache.tomcat.util.net.Acceptor.run()V (Acceptor.java:129)
  at java.lang.Thread.run()V (Thread.java:833)

I tried reproducing this on a local setup but have not been able to. Any suggestions or guidance on where I can improve the application's performance would be really helpful.

@violetagg violetagg self-assigned this Apr 15, 2024
violetagg (Member) commented:

@aspOEDev

> Initially I was using the autowired WebClient.Builder instance to initialise the client, but as load increased I observed calls for the same downstream client going to the wrong APIs, with requests getting mixed up. So I switched to creating a new builder instance with WebClient.builder() for every call, as suggested in some blogs, and that solved the wrong-call issue. After heap dump analysis we also reduced logging to lower memory consumption, but that has only made the service take longer to crash.

Please clarify whether you create the WebClient for every request?

If yes then please check this: https://stackoverflow.com/questions/77715508/httpclient-recomendations

violetagg (Member) commented:

@aspOEDev The mentioned versions are quite old, please upgrade to the latest supported versions.

@violetagg violetagg added the for/user-attention (This issue needs user attention) label Apr 15, 2024
aspOEDev (Author) commented:

@violetagg thanks for the suggestions.

> @aspOEDev
>
> Initially I was using the autowired WebClient.Builder instance to initialise the client, but as load increased I observed calls for the same downstream client going to the wrong APIs, with requests getting mixed up. So I switched to creating a new builder instance with WebClient.builder() for every call, as suggested in some blogs, and that solved the wrong-call issue. After heap dump analysis we also reduced logging to lower memory consumption, but that has only made the service take longer to crash.
>
> Please clarify whether you create the WebClient for every request?
>
> If yes then please check this: https://stackoverflow.com/questions/77715508/httpclient-recomendations

Yes, I am creating a new WebClient and builder instance per request. In our initial iteration we used a common autowired builder instance, which resulted in APIs and request content getting mixed up and wrong calls being fired, so we went with the safest approach of creating a new instance per request. I understand we can cache a client instance per host and reduce the memory footprint to some extent (see the sketch after this comment). I will also upgrade the versions and validate.

Let me give it a try and get back to you.
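
A minimal sketch of the per-host caching idea mentioned above, assuming a single injected WebClient.Builder; the class and method names are illustrative rather than taken from the issue:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.springframework.stereotype.Service;
import org.springframework.web.reactive.function.client.WebClient;

// Hypothetical sketch: build one WebClient per downstream base URL and reuse it,
// instead of creating a new builder and client on every request, so the underlying
// connection pool and event-loop resources are shared across calls.
@Service
public class CachedWebClientProvider {

    private final WebClient.Builder builder;
    private final Map<String, WebClient> clientsByBaseUrl = new ConcurrentHashMap<>();

    public CachedWebClientProvider(WebClient.Builder builder) {
        this.builder = builder;
    }

    public WebClient forBaseUrl(String baseUrl) {
        // computeIfAbsent builds the client once per host; later calls reuse it.
        // clone() keeps the shared builder itself unmodified.
        return clientsByBaseUrl.computeIfAbsent(baseUrl,
                url -> builder.clone().baseUrl(url).build());
    }
}

Per-request response timeouts can still be set on each call via .httpRequest(...) exactly as in the original snippet, so caching the client does not remove that flexibility.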

github-actions bot commented:

If you would like us to look at this issue, please provide the requested information. If the information is not provided within the next 7 days this issue will be closed.


github-actions bot commented May 1, 2024

Closing due to lack of requested feedback. If you would like us to look at this issue, please provide the requested information and we will re-open.

@github-actions github-actions bot closed this as not planned May 1, 2024
@violetagg violetagg added the status/invalid (We don't feel this issue is valid) label and removed the for/user-attention and status/need-feedback labels May 1, 2024