Improve Jest startup time and test runtime, particularly when running with coverage, by caching micromatch and avoiding recreating RegExp instances #10131

Merged: 8 commits, Jun 23, 2020

Commits on Jun 7, 2020

  1. Cache micromatch in SearchSource globsToMatcher

    I was profiling some Jest runs at Airbnb and noticed that on my
    MacBook Pro, we can spend over 2 seconds at Jest startup time in
    SearchSource getTestPaths. I believe that this will grow as the size
    of the codebase increases.
    
    Looking at the call stacks, it appears to be calling micromatch
    repeatedly, which calls picomatch, which builds a regex out of the
    globs. It seems that the parsing and regex building also triggers the
    garbage collector frequently.
    
    Upon testing, I noticed that the globs don't actually change between
    these calls, so we can save a bunch of work by making a micromatch
    matcher and reusing that function for all of the paths.
    
    micromatch has some logic internally to handle lists of globs that
    may include negated globs. A naive approach of just checking if it
    matched any of the globs won't capture that, so I copied and
    simplified the logic from within micromatch.
    
    https://github.com/micromatch/micromatch/blob/fe4858b0/index.js#L27-L77
    
    In my profiling of this change locally, this brings down the time of
    startRun from about 2000ms to about 200ms.
    lencioni committed Jun 7, 2020
    65a0405
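The approach described in this commit can be sketched roughly as follows. This is a minimal, self-contained illustration, not the code from the commit: `globToRegExp` and `compileGlob` are hypothetical stand-ins for picomatch's glob parsing and handle only `*`, `**`, and a leading `!`. The point it tries to show is that each glob is compiled once, and the negation handling then runs over the precompiled matchers for every path.

```typescript
type Matcher = (path: string) => boolean;

interface CompiledGlob {
  isMatch: Matcher;
  negated: boolean;
}

// Hypothetical stand-in for picomatch: supports only `*` and `**`.
function globToRegExp(glob: string): RegExp {
  const source = glob
    .replace(/[.+^${}()|[\]\\?]/g, '\\$&') // escape regex metacharacters
    .replace(/\*\*\//g, '\u0000') // placeholder for "**/"
    .replace(/\*\*/g, '\u0001') // placeholder for a bare "**"
    .replace(/\*/g, '[^/]*') // "*" stays within one path segment
    .replace(/\u0000/g, '(?:.*/)?') // "**/" spans zero or more segments
    .replace(/\u0001/g, '.*'); // bare "**" spans anything
  return new RegExp('^' + source + '$');
}

// Hypothetical helper: parse one glob exactly once.
function compileGlob(glob: string): CompiledGlob {
  const negated = glob.charAt(0) === '!';
  const regex = globToRegExp(negated ? glob.slice(1) : glob);
  // A negated glob "matches" every path that its underlying pattern rejects.
  return {negated, isMatch: path => regex.test(path) !== negated};
}

function globsToMatcher(globs: string[]): Matcher {
  const compiled = globs.map(compileGlob); // the expensive work happens here, once

  return path => {
    let kept: boolean | undefined;
    let negatives = 0;

    for (const {isMatch, negated} of compiled) {
      if (negated) {
        negatives++;
      }
      const matched = isMatch(path);
      if (!matched && negated) {
        kept = false; // a negated glob excluded this path
      } else if (matched && !negated) {
        kept = true; // a positive glob included this path
      }
    }

    // With only negated globs, keep anything that was not explicitly excluded.
    return negatives === compiled.length ? kept !== false : !!kept;
  };
}
```

Under these assumptions, `globsToMatcher(['**/*.js', '!**/node_modules/**'])` returns a function that can be called for each path without re-parsing either glob.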

Commits on Jun 8, 2020

  1. Avoid recreating RegExp instances in regexToMatcher

    After optimizing globsToMatcher, I noticed that there was still a
    lot of unnecessary overhead at Jest startup time spent recreating
    the same RegExp instances repeatedly. Thankfully, we can be a little
    smarter about this and create them all ahead of time and just reuse
    them.
    
    On top of my globsToMatcher optimization, this brings the speed of
    the ArrayMap in startRun down from about 160ms to about 7ms.
    lencioni committed Jun 8, 2020
    edaa4e1
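As a rough illustration of the idea in this commit (a sketch, not the commit's code), the RegExp instances can be built once when the matcher is created, so each path check only runs `test`. The `regexToMatcher` name comes from the commit title; the signature and the sample patterns below are assumptions.

```typescript
type Matcher = (path: string) => boolean;

function regexToMatcher(testRegex: string[]): Matcher {
  // Build every RegExp once, up front, instead of on each path check.
  const regexes = testRegex.map(source => new RegExp(source));

  return path =>
    regexes.some(regex => {
      const result = regex.test(path);
      // Defensive reset, in case a pattern is ever created with a stateful flag.
      regex.lastIndex = 0;
      return result;
    });
}

// Hypothetical patterns, for illustration only.
const isTestPath = regexToMatcher(['\\.test\\.js$', '/__tests__/']);
```

The slower shape this replaces would construct the RegExp inside the returned function, so every checked path pays for regex compilation again.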
  2. Move globsToMatcher from jest-core to jest-util

    I'd like to start using this in more places to improve performance,
    and jest-util seems like a better home for it. Now that it is a
    standalone module, I decided to write some unit tests for this
    function. In doing so, I uncovered a small difference between the
    behavior of this function and micromatch when overlapping glob
    patterns are used, which I also fixed.
    lencioni committed Jun 8, 2020
    e39266d
  3. Teach globsToMatcher to work with empty globs

    While incorporating this function into more places, I discovered a
    discrepancy here with how micromatch works. We can fix this by
    creating a fast path for when there are no globs at all.
    lencioni committed Jun 8, 2020
    9c01c20
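One way such a discrepancy can arise, sketched under the assumption that an empty glob list should match nothing: a rule like "if every glob is negated, keep whatever was not excluded" is vacuously true for an empty list, so every path would be kept. The `combineMatchers` function and `CompiledGlob` shape below are hypothetical and only illustrate the fast path.

```typescript
type Matcher = (path: string) => boolean;

interface CompiledGlob {
  isMatch: Matcher;
  negated: boolean;
}

function combineMatchers(compiled: CompiledGlob[]): Matcher {
  // Fast path: with no globs at all, nothing matches. Without this check the
  // "only negated globs" rule at the bottom holds vacuously for an empty
  // list, and every path would be kept.
  if (compiled.length === 0) {
    return () => false;
  }

  return path => {
    let kept: boolean | undefined;
    let negatives = 0;

    for (const {isMatch, negated} of compiled) {
      if (negated) {
        negatives++;
      }
      const matched = isMatch(path);
      if (!matched && negated) {
        kept = false;
      } else if (matched && !negated) {
        kept = true;
      }
    }

    return negatives === compiled.length ? kept !== false : !!kept;
  };
}
```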
  4. Optimize micromatch and RegExps in shouldInstrument

    I've been profiling running Jest with code coverage at Airbnb, and
    noticed that shouldInstrument is called often and is fairly
    expensive. It also seems to call micromatch and `new RegExp`
    repeatedly, both of which can be optimized by caching the work to
    convert globs and strings into matchers and regexes.
    
    I profiled this change by running a set of 27 fairly simple tests.
    Before this change, about 6-7 seconds was spent in shouldInstrument.
    After this change, only 400-500 ms is spent there. I would expect
    this delta to increase along with the number of tests and size of
    their dependency graphs.
    lencioni committed Jun 8, 2020
    793c8c6
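A minimal sketch of the caching idea, assuming the regexes arrive as pattern strings: keep one RegExp per distinct string so that a frequently called function never rebuilds it. The `getRegex` helper and the cache shape are illustrative assumptions, not necessarily what the commit does.

```typescript
// Cache of compiled regexes, keyed by their source string. A prototype-less
// object avoids collisions with inherited property names.
const regexCache: Record<string, RegExp> = Object.create(null);

function getRegex(source: string): RegExp {
  let regex = regexCache[source];
  if (regex === undefined) {
    regex = new RegExp(source); // compiled only on the first request
    regexCache[source] = regex;
  }
  // Reset state so a reused instance always starts matching from the beginning.
  regex.lastIndex = 0;
  return regex;
}
```

The same pattern applies to glob matchers: key the cache by the glob string and store the compiled matcher function instead of a RegExp.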
  5. Reduce micromatch overhead in jest-haste-map HasteFS

    I was profiling some Jest runs at Airbnb and noticed that on my
    MacBook Pro, we can spend over 30 seconds after running Jest with code
    coverage as the coverage reporter adds all of the untested files. I
    believe that this will grow as the size of the codebase increases.
    
    Looking at the call stacks, it appears to be calling micromatch
    repeatedly, which calls picomatch, which builds a regex out of the
    globs. It seems that the parsing and regex building also triggers the
    garbage collector frequently.
    
    Since this is in a tight loop and the globs won't change between
    checks, we can greatly improve the performance here by using our new
    and optimized globsToMatcher function, which avoids re-parsing globs
    unnecessarily.
    
    This optimization reduces the block of time here from about 30s to
    about 10s. The aggregated total time of coverage reporter's
    onRunComplete goes from 23s to 600ms.
    lencioni committed Jun 8, 2020
    f51fd34
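The shape of this optimization can be sketched as hoisting the glob compilation out of the per-file loop. The function below is a hypothetical simplification: `matchFilesWithGlobs` and its `compile` parameter are illustrative names, with `compile` standing in for a function such as globsToMatcher.

```typescript
type Matcher = (path: string) => boolean;
type Compile = (globs: string[]) => Matcher;

function matchFilesWithGlobs(
  files: string[],
  globs: string[],
  compile: Compile,
): string[] {
  // Compile the globs once, outside the loop; the globs do not change
  // between files, so there is no reason to parse them per file.
  const matcher = compile(globs);

  const matched: string[] = [];
  for (const file of files) {
    if (matcher(file)) {
      matched.push(file);
    }
  }
  return matched;
}
```

With this structure the cost of parsing the globs is paid once per call rather than once per file, which matters when the file list covers an entire codebase.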

Commits on Jun 9, 2020

  1. Add code comments to globsToMatcher

    The logic here might be a little confusing, so I am adding some
    comments that I hope will help make it easier for future explorers to
    understand. While I was doing this, I noticed a small way to simplify
    this function even more.
    lencioni committed Jun 9, 2020
    42ab2e9

Commits on Jun 23, 2020

  1. Update globsToMatcher.ts

    cpojer committed Jun 23, 2020
    03c8004