Allow more aggressive transform caching in multiproject monorepos #10835
Comments
I think we can make transformers smarter about when to bust the transform cache, by picking only the configuration that changes the output, instead of relying on one-size-fits-all solutions like stringified config or project name. If it's done well, overriding from transformers shouldn't be necessary.
Agreed with @thymikee, anything else is hacky.
@thymikee At the moment in a multi-project run, transformers can't be smart about it - Jest will always split the cache into separate caches per project. We could either remove Happy to also look at a follow up change to the babel-jest transformer to make the cache key only depend on the relevant parts of the Jest config. For comparison,
I don't understand the "let's be smarter about busting in transformers" comments - even if the cache key is the same it's a cache miss, since Jest will look in different directories for different projects when checking if the cached file exists. If by "instead of relying on one-size-fits-all solutions like stringified config or project name" you mean "remove project name from the algorithm", that has nothing to do with the transformers themselves. That code lives in Making
That's what I'm after.
Kinda related to my comment here: #10834 (comment) I'm all in for that 👍
Let's make it so the cache directory is the same across the root project. We should be able to do that when
Works for me 👍 Introduces race conditions between projects (in theory alleviated through
Sounds like a plan. I'd be up for adding a deprecation warning in Jest 27 and keep it for a while until community adapts, which may take a while. Then remove. Btw, why are we not using
Dunno. PR welcome? 😀
One further thought on this - will just removing the Edit: They already do.
One cent on
This issue is stale because it has been open for 1 year with no activity. Remove stale label or comment or this will be closed in 14 days.
This issue was closed because it has been stalled for 7 days with no activity. Please open a new issue if the issue is still relevant, linking to this one.
This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
🐛 Bug Report
In multiproject configs, each project root gets its own jest transform cache. This leads to duplicate work transforming the same files, even if they use the same configs. If a file is used in n projects, it will be transformed n times.
Locally, this increases cache disk usage. In CI, where the cache will not be warm, it increases runtime as duplicate work is required.
Writing a custom transformer with a simpler cache key implementation does not solve this - each project gets a separate cache folder.
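To illustrate why a simpler cache key alone does not help, here is a minimal sketch of the behaviour described above. It is not Jest's actual implementation: `ProjectConfig`, `transformCacheDir`, `cacheKey` and `cachedFilePath` are hypothetical names, and the directory layout is simplified. The only assumption taken from the report is that the cache folder embeds the project name.

```typescript
import * as path from "path";
import { createHash } from "crypto";

// Hypothetical stand-in for the per-project part of a Jest config.
interface ProjectConfig {
  cacheDirectory: string;
  name: string; // differs per project unless set explicitly
}

// Assumption: the transform cache folder embeds the project name.
function transformCacheDir(config: ProjectConfig): string {
  return path.join(config.cacheDirectory, "jest-transform-cache-" + config.name);
}

// A deliberately simple cache key that ignores the project entirely.
function cacheKey(source: string, filename: string): string {
  return createHash("md5").update(source).update("\0").update(filename).digest("hex");
}

// Where a cached transform result would be looked up for a given project.
function cachedFilePath(config: ProjectConfig, source: string, filename: string): string {
  return path.join(transformCacheDir(config), cacheKey(source, filename));
}

const source = "export const c = 1;";
const file = "/repo/packages/c/index.js";
const projectA: ProjectConfig = { cacheDirectory: "/tmp/jest", name: "project-a" };
const projectB: ProjectConfig = { cacheDirectory: "/tmp/jest", name: "project-b" };

// The key is identical for both projects, but the lookup path is not.
const sameKey = cacheKey(source, file) === cacheKey(source, file);
const samePath =
  cachedFilePath(projectA, source, file) === cachedFilePath(projectB, source, file);
console.log({ sameKey, samePath }); // { sameKey: true, samePath: false }
```

Under this assumption, the second project always misses the cache for the shared file, no matter how aggressive the key is.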
The relevant code is in https://github.com/facebook/jest/blob/c98b22097cb6faa3ed3fabf197cbe4f466620b9f/packages/jest-transform/src/ScriptTransformer.ts#L132-L136; it forces a unique cache path per `config.name`. If unassigned, `config.name` is assigned a hash based on the path and index.
I've tried adding a common `name` to all projects' jest configs. That fixes the transform problem, but breaks other things (manual mocks in a `__mocks__` folder don't work consistently). On our large monorepo, this gave a ~30% improvement in total runtime, but `__mocks__` became unpredictable.
I appreciate that there are edge cases to handle here (potentially different jest configs could warrant a different cache), but I think it should be up to the jest transformer to decide whether this is important (e.g. if relevant, a transformer could include `config.name` in the cache key manually).
e.g. optionally allow transformers to provide their own implementation of `getCacheFilePath`, which overrides the use of `HasteMap.getCacheFilePath(this._config.cacheDirectory, 'jest-transform-cache-' + this._config.name, VERSION)`.
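A rough sketch of what such an optional override could look like. Everything here is hypothetical: the `getCacheFilePath` hook on the transformer is the proposal, not an existing Jest API, and `Transformer`, `ProjectConfig`, `defaultCacheFilePath` and `resolveCacheDir` are illustrative names. The default path is a simplified stand-in for the `HasteMap.getCacheFilePath` call.

```typescript
import * as path from "path";

// Hypothetical stand-in for the relevant parts of a project config.
interface ProjectConfig {
  cacheDirectory: string;
  name: string;
}

// Hypothetical transformer shape with the proposed optional hook.
interface Transformer {
  getCacheFilePath?: (config: ProjectConfig) => string;
}

// Simplified stand-in for the current per-project default.
function defaultCacheFilePath(config: ProjectConfig): string {
  return path.join(config.cacheDirectory, "jest-transform-cache-" + config.name);
}

// Use the transformer's own cache location if it provides one.
function resolveCacheDir(transformer: Transformer, config: ProjectConfig): string {
  return transformer.getCacheFilePath
    ? transformer.getCacheFilePath(config)
    : defaultCacheFilePath(config);
}

// A transformer that opts in to one cache shared by all projects.
const sharedTransformer: Transformer = {
  getCacheFilePath: (config) =>
    path.join(config.cacheDirectory, "jest-transform-cache-shared"),
};

// A transformer that keeps the default per-project behaviour.
const defaultTransformer: Transformer = {};

const pkgA: ProjectConfig = { cacheDirectory: "/tmp/jest", name: "a" };
const pkgB: ProjectConfig = { cacheDirectory: "/tmp/jest", name: "b" };

console.log(resolveCacheDir(sharedTransformer, pkgA) === resolveCacheDir(sharedTransformer, pkgB));
console.log(resolveCacheDir(defaultTransformer, pkgA) === resolveCacheDir(defaultTransformer, pkgB));
```

Transformers that do not implement the hook would keep the current per-project behaviour, so the change would be opt-in.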
If this sort of change would be accepted, I can probably provide a PR.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
If the transform config is the same, each file is only transformed once
Link to repl or repo (highly encouraged)
https://github.com/lexanth/jest-projects-repro
This is a monorepo with 3 packages (A, B and C). A and B consume C. The code in C currently gets transformed once per package, even with the transformer (in the jest-preset package) giving a super aggressive cache key implementation (`yarn test:ci` - could be used e.g. in CI, if we know the other relevant configs are constant).
Adding `name: process.env.USE_SIMPLIFIED_CACHE ? '_' : undefined` to each package's jest config makes them all use the same cache, but in my actual repo breaks other things, being a bit of a hack.
Everything is running in band because the tests are so fast that multiple workers all start transforming before another can populate the cache anyway.
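For reference, the `name` workaround could be factored into a small helper shared by each package's config. This is only a sketch: `packageConfig` and `PackageJestConfig` are hypothetical names, and only the `name` expression is taken from the repro.

```typescript
// Hypothetical subset of a package-level jest config.
interface PackageJestConfig {
  name?: string;
}

// Returns a shared project name when USE_SIMPLIFIED_CACHE is set, so that all
// packages resolve to the same transform cache; otherwise leaves `name` unset
// and jest assigns its own per-project value.
function packageConfig(env: Record<string, string | undefined>): PackageJestConfig {
  return {
    name: env.USE_SIMPLIFIED_CACHE ? "_" : undefined,
  };
}

// In a package's jest config this would be driven by the real environment.
const config = packageConfig(process.env);
console.log(config.name);
```

As noted above, this is a hack: sharing `name` also affects other per-project behaviour such as manual mocks.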
envinfo