Direct memory leak when getting object metadata inside parallelStream with Java 17 #3067

Open
winzsanchez opened this issue Dec 7, 2023 · 2 comments
Labels: bug (This issue is a bug), p2 (This is a standard priority issue)

@winzsanchez

Describe the bug

When client.getObjectMetadata(bucket, key) is called inside a parallelStream() on Java 17, direct memory usage goes up, provided there are multiple objects in the bucket.

This does not happen with Java 11 or when using stream().
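
One difference between the two cases is which threads do the work: stream() runs on the calling thread, while parallelStream() may also use worker threads. The following is a minimal sketch using only the JDK (no SDK calls) to make that visible. The class name StreamThreads and the helper threadsUsed are hypothetical, and whether the number of threads has anything to do with the memory growth is an assumption that this thread has not confirmed.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

public class StreamThreads {

    // Records the name of every thread that processes at least one element.
    public static Set<String> threadsUsed(Stream<Integer> stream) {
        Set<String> names = ConcurrentHashMap.newKeySet();
        stream.forEach(i -> names.add(Thread.currentThread().getName()));
        return names;
    }

    public static void main(String[] args) {
        List<Integer> items = IntStream.range(0, 10_000).boxed().collect(Collectors.toList());

        // A sequential stream is processed entirely on the calling thread.
        System.out.println("serial:   " + threadsUsed(items.stream()));

        // A parallel stream may additionally use worker threads; how many
        // depends on the machine and the JVM configuration.
        System.out.println("parallel: " + threadsUsed(items.parallelStream()));
    }
}
```

Printing the thread names next to the [MEM] lines of the reproduction would show whether the growth appears only once additional threads have issued requests.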

Expected Behavior

Direct memory shouldn't be going up.

Current Behavior

When using Java 17 we see the following:

[MEM] Initial memory used: 16384
[MEM] Memory used after serial stream: 16384
[MEM] Memory used after parallel stream: 57344
[MEM] Memory used after serial stream: 57344

With Java 11:

[MEM] Initial memory used: 8192
[MEM] Memory used after serial stream: 8192
[MEM] Memory used after parallel stream: 8192
[MEM] Memory used after serial stream: 8192

Reproduction Steps

    import java.lang.management.BufferPoolMXBean;
    import java.lang.management.ManagementFactory;
    import java.util.function.Consumer;

    import com.amazonaws.services.s3.AmazonS3;
    import com.amazonaws.services.s3.model.ObjectListing;
    import com.amazonaws.services.s3.model.ObjectMetadata;
    import com.amazonaws.services.s3.model.S3ObjectSummary;

    // MXBean for the JVM's "direct" buffer pool; getMemoryUsed() reports bytes.
    protected static BufferPoolMXBean directMemoryMXBean;

    static {
        for (final BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if (pool.getName().equals("direct")) {
                directMemoryMXBean = pool;
                break;
            }
        }
    }

    public static void listObjects(AmazonS3 client, String bucketName, String prefix) throws InterruptedException {
        // Collect every object summary under the prefix, following pagination.
        ObjectListing listing = client.listObjects(bucketName, prefix);
        var objectSummaries = listing.getObjectSummaries();

        while (listing.isTruncated()) {
            listing = client.listNextBatchOfObjects(listing);
            objectSummaries.addAll(listing.getObjectSummaries());
        }
        System.out.println("objectSummaries: " + objectSummaries);

        // Serial pass on the calling thread.
        System.out.println("[MEM] Initial memory used: " + directMemoryMXBean.getMemoryUsed());
        objectSummaries.forEach(printMetadata(client, bucketName));
        System.out.println("[MEM] Memory used after serial stream: " + directMemoryMXBean.getMemoryUsed());

        // Parallel pass: this is where direct memory grows on Java 17.
        objectSummaries.parallelStream().forEach(printMetadata(client, bucketName));
        System.out.println("[MEM] Memory used after parallel stream: " + directMemoryMXBean.getMemoryUsed());

        // Second serial pass, to check whether the memory is released afterwards.
        objectSummaries.forEach(printMetadata(client, bucketName));
        System.out.println("[MEM] Memory used after serial stream: " + directMemoryMXBean.getMemoryUsed());
    }

    private static Consumer<S3ObjectSummary> printMetadata(AmazonS3 client, String bucketName) {
        return s -> System.out.println(s + " metadata: " + getMetadata(client, bucketName, s.getKey()));
    }

    public static ObjectMetadata getMetadata(AmazonS3 client, String bucketName, String name) {
        return client.getObjectMetadata(bucketName, name);
    }
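
The reproduction prints only getMemoryUsed(). BufferPoolMXBean also exposes getCount() and getTotalCapacity(), and all of the figures reported above are multiples of 8192, so logging the buffer count as well might show whether the growth corresponds to additional buffers being held. The following is a minimal JDK-only sketch of such a snapshot. The class name DirectPoolSnapshot and its helpers are hypothetical, and the allocateDirect call in main only stands in for whatever allocates direct buffers in the real scenario.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

public class DirectPoolSnapshot {

    // Looks up the platform MXBean for the "direct" buffer pool.
    public static BufferPoolMXBean directPool() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool;
            }
        }
        throw new IllegalStateException("no 'direct' buffer pool reported by this JVM");
    }

    // Formats buffer count, used bytes and total capacity on one log line.
    public static String snapshot(String label) {
        BufferPoolMXBean pool = directPool();
        return String.format("[MEM] %s: count=%d used=%d capacity=%d",
                label, pool.getCount(), pool.getMemoryUsed(), pool.getTotalCapacity());
    }

    public static void main(String[] args) {
        System.out.println(snapshot("before"));

        // Stand-in allocation so the snapshot has something to report.
        ByteBuffer buffer = ByteBuffer.allocateDirect(8192);
        System.out.println(snapshot("after allocateDirect(8192)"));

        // Touch the buffer so it stays reachable until after the snapshot.
        buffer.put(0, (byte) 1);
    }
}
```

In the reproduction, the three getMemoryUsed() prints could be swapped for snapshot(...) calls to compare buffer counts between Java 11 and Java 17.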

Possible Solution

No response

Additional Information/Context

No response

AWS Java SDK version used

1.12.606

JDK version used

java version "17.0.9" 2023-10-17 LTS
Java(TM) SE Runtime Environment (build 17.0.9+11-LTS-201)
Java HotSpot(TM) 64-Bit Server VM (build 17.0.9+11-LTS-201, mixed mode, sharing)

Operating System and version

Windows 10

winzsanchez added the bug and needs-triage labels on Dec 7, 2023
debora-ito added the p2 label on Dec 13, 2023
@debora-ito
Member

@winzsanchez can you generate the SDK client-side metrics? We do generate metrics about memory usage, and it would be interesting to see a side-by-side comparison of the two cases.

For instructions on how to generate the client-side metrics, please check our Developer Guide.
Also check our blog post that shows how to interpret the metrics, "Tuning the AWS SDK for Java to Improve Resiliency". It is mostly about timeouts and retries rather than memory usage, but it's informative.

debora-ito added the response-requested label and removed the needs-triage label on Dec 13, 2023
@winzsanchez
Author

JvmMetric-2023_12_18_10_45_00-2023_12_18_11_20_00-UTC-5.csv

Hi @debora-ito, attached are the memory metrics.
The ones from 10:45 to 10:55 are with Java 11.
The ones from 11:05 to 11:20 are with Java 17.

github-actions bot removed the response-requested label on Dec 18, 2023