
[Bug]: Jest main process memory usage grows unboundedly with v8 coverage #14287

Open
benjaminjkraft opened this issue Jun 29, 2023 · 3 comments


benjaminjkraft commented Jun 29, 2023

Version

29.5.0/main

Steps to reproduce

  1. Have a large codebase, with many test files that import much of the codebase.
  2. Run the tests with the v8 coverage provider (`coverageProvider: 'v8'`).

(I can create a repro here, but the root cause is clear enough that I didn't bother.)

Expected behavior

If I run N test files which import a large codebase, the Jest main process's memory usage should grow slowly, if at all.

Actual behavior

Each test file executed increases the jest main process memory usage by quite a lot -- to the point where running with coverage enabled on our whole codebase would use 16GB or more (that's where I gave up).

Additional context

The issue here is that the main process records the v8 coverage output separately for every test file that it runs, and only merges them at the end. The code that keeps them is here:

this._v8CoverageResults.push(testResult.v8Coverage);

and then the reports don't get merged until here (called in onRunComplete):

const mergedCoverages = mergeProcessCovs(
  this._v8CoverageResults.map(cov => ({result: cov.map(r => r.result)})),
);

For a large codebase, these outputs can be 100MB or more, because they include, among other things, the source of every imported file (and, if source maps are enabled, the pre-transpilation source too), as well as some representation of its control-flow graph.

My assumption is that the best way to fix this is to merge as we go, as the Babel coverage provider does, although I don't know whether that would cost too much in performance. (Maybe merge whenever the total number of not-yet-merged coverage report items exceeds 10k or so?)
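The merge-as-we-go idea above could be sketched roughly as follows. This is an illustrative sketch, not Jest's actual code: the class name, the threshold constant, and especially the merge function are assumptions. The stand-in merge here just deduplicates scripts by URL to keep the example self-contained; the real mergeProcessCovs from @bcoe/v8-coverage combines function range counts across process coverages.

```typescript
// Sketch: merge v8 coverage incrementally once a threshold of unmerged
// ScriptCoverage entries accumulates, instead of holding every test
// file's raw output until onRunComplete.

type ScriptCov = {url: string; functions: unknown[]};
type ProcessCov = {result: ScriptCov[]};

// Stand-in for mergeProcessCovs from @bcoe/v8-coverage. Here we only
// deduplicate scripts by URL; the real function merges coverage counts.
function mergeProcessCovs(covs: ProcessCov[]): ProcessCov {
  const byUrl = new Map<string, ScriptCov>();
  for (const cov of covs) {
    for (const script of cov.result) {
      byUrl.set(script.url, script);
    }
  }
  return {result: [...byUrl.values()]};
}

const MERGE_THRESHOLD = 10_000; // unmerged ScriptCoverage entries

class CoverageAccumulator {
  private pending: ProcessCov[] = [];
  private pendingScripts = 0;

  add(cov: ProcessCov): void {
    this.pending.push(cov);
    this.pendingScripts += cov.result.length;
    if (this.pendingScripts > MERGE_THRESHOLD) {
      this.flush();
    }
  }

  private flush(): void {
    // Collapse everything seen so far into a single ProcessCov,
    // freeing the per-test-file copies for garbage collection.
    this.pending = [mergeProcessCovs(this.pending)];
    this.pendingScripts = this.pending[0].result.length;
  }

  finalize(): ProcessCov {
    this.flush();
    return this.pending[0];
  }
}
```

Since merging is associative, flushing early should produce the same final report as the current merge-once-at-the-end behavior, at the cost of doing some merge work repeatedly.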

I suspect this may be behind #5837, and perhaps other issues with memory usage with --coverage, but of course it's hard to tell.

Environment

System:
    OS: macOS 13.4
    CPU: (10) x64 Apple M1 Max
  Binaries:
    Node: 16.17.1 - /nix/store/fyaisfdhkviz6sb628qxlmy5wcasqfhq-nodejs-16.17.1/bin/node
    npm: 8.12.1 - ~/.local/bin/npm

benjaminjkraft commented Jul 19, 2023

Ok, I now have a POC fix for this. The changes are roughly:

  1. If the coverage map gets too big, call mergeProcessCovs immediately rather than at the end of the run.
  2. Don't put coverage data in aggregatedResults.

Part 1 I currently have set to run when there are more than 10k unmerged ScriptCoverage objects. That caps memory usage at roughly 750MB in our codebase; the exact numbers will of course vary with codebase size and file size. There's obviously a perf cost to running the merge more often. Probably the best thing is to make the threshold configurable, and either have it off by default or use a fairly high default value like 10k or 30k, so that it never triggers for small projects.

Part 2 is a breaking change as written. I'm not sure how best to handle that; if some reporter genuinely wants the file-by-file coverage data, we need to keep it around. (I can't find any public code on GitHub that obviously does so -- I found ~5 projects, excluding vendors/forks of Jest, that look at v8Coverage at all, and none via aggregatedResults -- but obviously that's no guarantee.) Maybe this goes behind a flag as well, so that large codebases can turn it on as long as they don't use any such reporter? That seems pretty awkward. (Maybe there's a way we can flag it as a deprecated API?)

Anyway, I'll put together a PR, but any suggestions as to how to handle the flagging are welcome. Also welcome: suggestions for how to test that the new code doesn't, per (2), hold on to references to the original v8Coverage objects after the merge.


regevbr commented Jul 27, 2023

I'm experiencing the same thing: with coverage enabled, the process crashes with an OOM.

mrazauskas commented

For one project I use Jest together with c8. It is a CLI app, and I have to spawn a child process for the e2e tests. If the build files have source maps, c8 is able to collect coverage for the TypeScript files.

A detail worth noticing: c8 generates a lot of temporary files with coverage data and processes them at some point. That was still causing OOMs for some users, so as a solution they recently introduced the mergeAsync option. See bcoe/c8#469

Sounded relevant to the problem. Using c8 directly with Jest could also be a way to avoid the OOMs.
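As a rough illustration of that workaround (the exact invocation is an assumption based on c8's documented flags, not something verified against this project), running Jest under c8 instead of Jest's built-in coverage might look like:

```shell
# Run Jest with its own coverage collection disabled and let c8 gather
# V8 coverage for the Node processes instead. --merge-async is the
# option introduced in bcoe/c8#469 to merge coverage reports
# asynchronously and reduce peak memory usage.
npx c8 --merge-async --reporter=text npx jest
```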
