Skip to content

stokpop/memory-check

Repository files navigation

Stokpop Memory Check

Reads multiple memory histograms from standard Java histogram dumps and report on possible memory leaks.

Sponsored by Rabobank.

Quick start

Point to a directory with histogram-dumps and the file extension used for the histogram files. In this example the files are called memory-dump-2021-03-03T00-30-52.435.histo. How to create multiple histograms over time is described below.

java -jar memory-check-exec.jar -d /path/to/histo-dumps -e histo

summary of findings

The analysis result categories are:

analysis result count icon description
GROW_CRITICAL 61 critical growth detected (above 'maximum allowed growth percentage')
GROW_MINOR 3 minor growth detected (below 'maximum allowed growth percentage')
GROW_SAFE 4 growth detected in 'safe list' of known growing classes
GROW_HICK_UPS 28 growth with hick-ups (less than 'minimum growth points percentage')
SHRINK_AND_GROW 320 both shrink and growth detected
STABLE 533 all histograms show same number of objects
SHRINK 2 opposite of growth: only shrinks detected
UNKNOWN 0 ? no matching analysis result

Concepts

Histogram dumps contain lines for all classes in the jvm with the number of instances and the total number of bytes used by these classes. You can specify to list the live objects only. Because we are interested in memory leaks, we do not want the non-live objects that can be collected by a garbage collect.

Contents of a histogram dump file, we only show the 'top-10':

 num     #instances         #bytes  class name (module)
-------------------------------------------------------
   1:        181128       21229800  [B (java.base@11.0.10)
   2:        188244        6023808  java.util.concurrent.ConcurrentHashMap$Node (java.base@11.0.10)
   3:        162431        3898344  java.lang.String (java.base@11.0.10)
   4:         40944        3603072  java.lang.reflect.Method (java.base@11.0.10)
   5:         10342        3054664  [Ljava.util.concurrent.ConcurrentHashMap$Node; (java.base@11.0.10)
   6:         23108        2931360  java.lang.Class (java.base@11.0.10)
   7:         59338        2848224  org.aspectj.weaver.reflect.ShadowMatchImpl
   8:         15952        2171032  [I (java.base@11.0.10)
   9:         59338        1898816  org.aspectj.weaver.patterns.ExposedState
  10:         36916        1790504  [Ljava.lang.Object; (java.base@11.0.10)
  ...

On number 1 is the [B class name, which are byte arrays. See this overview of Java VM Type Signatures. Byte arrays are for instance backing arrays of Strings, which are on place 3. On spot 10 we see arrays of Objects: [Ljava.lang.Object;.

The number of bytes is the bytes taken by the instances itself, not by referent instances. For instance, the String classes take 3.898.344 bytes for 162.431 instances, which is basically 24 bytes per instance.

Analysis

All instances and bytes for a class name is matched to the instances and bytes in all histograms, ordered on time. So a timestamp in the histogram dump names is essential.

Next, the possible growth is checked over time, looking at the number of bytes per class.

The overall analysis shows classification of all classes that have more bytes in last histogram than specified via --bytes-limit.

The detailed analysis and graphs only report the specified classifications (e.g. only grow_critical and grow_minor) and only specified top number of classes (with --class-limit option).

growth

If we find an increase in bytes in all histograms, the class is said to grow. This might be a memory leak.

If the increase in bytes, from the first dump to the last dump is above a certain percentage threshold, the growth is marked as critical. Say 5 percent. If it is below this threshold, it is considered a minor growth. Use the --max-growth-percentage option.

growth hick ups

Some classes do not grow on every histogram dump. For a load test with continuous load we expect most (but not all!) leaks to grow continuously with the constant flow of requests. For dumps of production systems this can of course be different, depending on incoming requests over time.

We classify GROWTH_HICK_UPS when from all histograms we see only growth, but not in all histograms. Use the --min-growth-points-percentage to specify the percentage of histograms to should show growth. For instance, when set to 50% and with 11 dumps, at least 5 differences between dumps should show growth to be GROW_CRITICAL or GROW_MINOR. Otherwise it is GROW_HICK_UPS.

Note: no check on bytes growth is done on GROW_HICK_UPS to determine CRITICAL or MINOR.

shrink and grow

Classes that show both shrinks and grows in bytes are classified as SHRINK_AND_GROW.

Note: no check on bytes growth is done on SHRINK_AND_GROW to determine CRITICAL or MINOR.

stable

Classes that show same size in bytes in all histograms are classified as STABLE.

unknown

When last histogram does not contain the class, so we do not know the final number of bytes, the classification in UNKNOWN.

Also when not enough data is available the classification is UNKNOWN. Some dynamicly named classes might for instance only be present in one dump (with instances and bytes).

safe-list

Some classes that show growth might be considered false positives. Or are recognized to grow always during load tests or production run, but cannot be fixed after investigation.

We found some tomcat classes to grow over time during Spring Boot application load tests and we could not find a good way to avoid or solve this growth. It seems to get more stable over time as well. After investigation, we allow these classes to grow but still want to see them in the reports.

For these cases the class can be put on the safe-list. The following happens:

  • the classification is set to GROW_SAFE instead of GROW_CRITICAL or GROW_MINOR when growth is detected
  • in the reports the class names are prefixed by (SL) for Safe-List

Classes can be added to the safe-list via comma separated command line or via a safe-list file. A wildcard can be used with *, e.g. java.lang.* or *Connection*

watch-list

Classes that need special attention can be put on the watch-list. The following happens:

  • even when the number of bytes in the last histogram is below the minimum threshold for including the class in the report details, the class is shown

  • even when the maximum number of classes to report in the details is reached, the classes on the watch list are still included

  • in the reports the class names are prefixed by (WL) for Watch-List

Classes can be added to the watch-list via comma separated command line or via a safe-list file.

Example call:

java -jar memory-check-exec-1.2.1.jar \
  --dir /path/to/jvm-histograms/ \
  --ext histo \
  --id duration-test-123-#ts# \
  --report-dir my-reports \
  --class-limit 42 \
  --bytes-limit 100kb \
  --max-growth-percentage 10 \
  --min-growth-points-percentage 5 \
  --categories grow_critical,grow_minor,grow_safe \
  --safe-list java.lang.String \
  --watch-list java.lang.ThreadLocal,nl.stokpop.*

Use cases

  • Automatically check for possible memory leaks after continuous load tests.
  • Create some dumps during a troubleshoot and run an analysis.
  • Create some dumps to do a memory usage sanity check on applications it test or production.
  • Find memory behaviour of specific classes.

To get to the proper level of reporting, use the available thresholds and settings.

If interested in classes that only grow, no matter how much:

  --class-limit 1000 \
  --bytes-limit 0 \
  --max-growth-percentage 0 \
  --min-growth-points-percentage 100 \
  --categories grow_critical,grow_minor \

Show all for only the classes on watch-list, e.g. all classes with Connection in the name:

  --class-limit 0 \
  --bytes-limit 0 \
  --max-growth-percentage 0 \
  --min-growth-points-percentage 0 \
  --categories all \
  --white-list *Connection*

Will show all Connection related classes of hibernate and apache:

screen shot connections graph

Reports

The memory-check will create a text output on standard out and a json and html report. Use the id option to give the reports a custom name. Use the report-dir option to save it in another location.

The html report show graphs of bytes and instances in use over time.

In the diff (difference) charts you can find leaks if the graph line stays above the zero line, meaning that objects are created but not removed.

screen shot of bytes-diff graph

Generate histogram dump

Use the following command to dump the live objects of a java process.

jmap -histo:live $JAVA_PID > memory-dump-$(date +%Y-%m-%dT%H-%M-%S).histo

Generate 4 dumps with 10 seconds apart for application-name:

./tools/create-dumps.sh application-name 4 10000   

Generate histogram via jmx:

java -cp memory-check-exec.jar nl.stokpop.jmx.FetchHistogramKt localhost 5000 > memory-dump-$(date +%Y-%m-%dT%H-%M-%S).histo

Enable jmx on your java process by adding these jvm options:

-Dcom.sun.management.jmxremote.port=5000
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.authenticate=false

Make sure there is a timestamp in the dump filename that memory-check can use to determine the time of the dump. If not found in the filename it will try to use the creation date file attribute, but that might not always be correct, for example after a file copy. Use the following format:

yyyy-MM-ddTHH-mm-ss(.SSS)

Optionally you can also add milliseconds/nanoseconds for better precision. Example filename:

memory-dump-2020-06-17T17-52-32.433463.histo

Histogram file names

Memory-check first checks the filename to contain a ISO date and uses that as a timestamp for the histogram dump. The filename should contain an ISO formatted date like: 2020-06-17T22-25-38.960921 If not found, uses file creation date or file last modified as last resort. Be very careful that these times are correct! It is adviced to always use timestamps in the filenames.

Usage

Usage: memory-check-cli [OPTIONS]

Options:
  -d, --dir TEXT                   Look in this directory for heap histogram
                                   dumps.
  -e, --ext TEXT                   Only process files with this extension,
                                   example: 'histo'
  -i, --id TEXT                    Identifier for the report, example:
                                   'test-run-1234'. Include #ts# for a
                                   timestamp.
  -r, --report-dir TEXT            Full or relative path to directory for the
                                   reports, example: '.' for current directory
  -c, --class-limit INT            Report only the top 'limit' classes,
                                   example: '128'.
  -b, --bytes-limit TEXT           Report class only when last dump has at
                                   least x bytes, example: '2048' or '2KB'
  -cat, --categories TEXT          Comma separated file with categories to
                                   report: 'grow_critical, grow_minor,
                                   grow_safe, grow_hick_ups, shrink_and_grow,
                                   shrink, stable, unknown'. Or 'all'.
                                   Default: 'grow_critical,grow_minor'
  -mgp, --max-growth-percentage FLOAT
                                   Maximum allowed growth in percentage before
                                   reporting a critical growth. Default: 5.0
  -mgpp, --min-growth-points-percentage FLOAT
                                   Minimum percentage of growth points to be
                                   considered growth. Default: 50.0
  -sl, --safe-list TEXT            Comma separated list of fully qualified
                                   classnames that are 'safe to growth'. The
                                   asterisk (*) can be used as wildcard.
                                   Default: ""
  -wl, --watch-list TEXT           Comma separated list of fully qualified
                                   classnames that are 'always watched'
                                   irrelevant of other settings. The asterisk
                                   (*) can be used as wildcard. Default: ""
  -slf, --safe-list-file PATH      The safe list file. Should contain one
                                   fully qualified classname per line.
  -wlf, --watch-list-file PATH     The safe list file. Should contain one
                                   fully qualified classname per line.
  -h, --help                       Show this message and exit

Build executable jar

If you want to build an executable jar yourself use:

./gradlew clean fatJar