Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rollup config hashing for cache is very slow when rebuilding in watch mode #443

Closed
Hental opened this issue Apr 8, 2023 · 10 comments · Fixed by #452
Closed

Rollup config hashing for cache is very slow when rebuilding in watch mode #443

Hental opened this issue Apr 8, 2023 · 10 comments · Fixed by #452
Labels
kind: optimization Performance, space, size, etc improvement scope: cache Related to the cache scope: watch mode Related to Rollup's watch mode solution: workaround available There is a workaround available for this issue

Comments

@Hental
Copy link

Hental commented Apr 8, 2023

Troubleshooting

  1. Does tsc have the same output? If so, please explain why this is incorrect behavior

  2. Does your Rollup plugin order match this plugin's compatibility? If not, please elaborate

  3. Can you create a minimal example that reproduces this behavior? Preferably, use this environment for your reproduction

What happens and why it is incorrect

Environment

Versions

System:
    OS: Linux 4.19 Debian GNU/Linux 10 (buster) 10 (buster)
    CPU: (8) x64 Intel(R) Xeon(R) Platinum 8260 CPU @ 2.40GHz
    Memory: 10.49 GB / 15.41 GB
    Container: Yes
    Shell: 5.7.1 - /usr/bin/zsh
 Binaries:
    Node: 16.19.0 - /run/user/1001/fnm_multishells/846809_1680868810665/bin/node
    Yarn: 1.22.19 - /run/user/1001/fnm_multishells/846809_1680868810665/bin/yarn
    npm: 8.19.3 - /run/user/1001/fnm_multishells/846809_1680868810665/bin/npm
  npmPackages:
    typescript: ^5.0.2 => 5.0.2 

rollup.config.js

:
import path from 'path';
import { defineConfig } from 'rollup';

import { genPlugin, genTypeDefinePlugin } from './plugins';
import { resolveFile, getPackageJson } from './utils';

const pkg = getPackageJson();
const libName = pkg.name;

const devPlugins = genPlugin({
  isDev: true,
  pluginTypescriptOptions: {
    sourceMap: true,
  },
});

/** @type {import('rollup').OutputOptions[]} */
const output = [
  // {
  //   file: resolveFile(`dist/${libName}.cjs.js`),
  //   format: 'cjs',
  // },
  {
    file: resolveFile(`dist/${libName}.js`),
    format: 'es',
  },
];

// build config for development
export default defineConfig([
  {
    watch: {
      include: ['src/**'],
    },
    input: resolveFile('src/index.ts'),
    output,
    plugins: devPlugins,
    external: [...Object.keys(pkg.dependencies), 'protobufjs/minimal'],
  },
]);

tsconfig.json

:
{
  "compilerOptions": {
    "moduleResolution": "node",
    "module": "ESNext",
    "target": "ES2020",
    "lib": ["dom", "ESNext"],
    "sourceMap": true,
    "declaration": true,
    "declarationMap": true,
    "declarationDir": "./type",
    "outDir": "./lib",
    "strict": true,
    "noImplicitAny": true,
    "allowSyntheticDefaultImports": true,
    "strictPropertyInitialization": false,
    "emitDecoratorMetadata": true,
    "experimentalDecorators": true,
    "baseUrl": ".",
    "paths": {
      "@/*": ["src/*"]
    },
    "plugins": [
      {
        "transform": "typescript-transform-paths",
        "useRootDirs": true,
        "afterDeclarations": true
      }
    ]
  },
  "include": ["src/**/*.ts", "src/**/*.d.ts"]
}

package.json

:
 "rollup-plugin-typescript2": "^0.34.1",

plugin output with verbosity 3

:

detail

When rebuilding, rollup-plugin-typescript2 computes a hash key for the cache. One portion of the hash is the Rollup config, which is actually so big of an object that it is a significant performance degradation to compute its hash.

Referencing this part of code:

this.cacheDir = `${this.cacheRoot}/${this.cachePrefix}${objHash(
{
version: this.cacheVersion,
rootFilenames,
options: this.options,
rollupConfig: this.rollupConfig,
tsVersion: tsModule.version,
},
this.hashOptions,
)}`;

In the profiling below, we can see that computing the hash compute takes up almost all of the time, about 20s:

image

image

@Hental
Copy link
Author

Hental commented Apr 8, 2023

can we ignore rollup config or only use part fields when we compute hash

@ezolenko
Copy link
Owner

ezolenko commented Apr 8, 2023

That's interesting... Really we don't need rollup config in the hash in watch mode, since it can't change there, but such hash would be incompatible with normal builds where we do want rollup config in case it changes...

We can either use a separate hash strategy in watch mode (with a second set of cache artifacts), or a better way would be to pre-hash rollup config and only regenerate that in watch mode if config file changes (if we can/need to detect that).

The question is, what does rollup do to handle config changes in watch mode? Does it rebuild everything? Or ignores any changes?

@ezolenko ezolenko added kind: optimization Performance, space, size, etc improvement scope: cache Related to the cache scope: watch mode Related to Rollup's watch mode labels Jun 20, 2023
@ezolenko
Copy link
Owner

Workaround, use clean: true in watch mode

@ezolenko ezolenko added the solution: workaround available There is a workaround available for this issue label Jun 20, 2023
@ezolenko
Copy link
Owner

ezolenko commented Jun 23, 2023

Cache is no longer used during watch mode in 0.35.0 now

@agilgur5 agilgur5 changed the title cache use hash cost last of time ,cause it too slow when rebuild in watch mode Rollup config hashing for cache is very slow when rebuilding in watch mode Jun 30, 2023
@agilgur5
Copy link
Collaborator

agilgur5 commented Jun 30, 2023

Thanks for providing some profiling! I knew the Rollup config hashing was slow and buggy (see also #228) but hadn't ever seen it be that slow 😬

Resolving the root cause here is a bit complicated (as #228 alludes to), as a change in any part of the config could result in a different output. Conservatively busting the cache on any change is safer. Limiting which parts of the config are cached could make the cache busting more inaccurate, resulting in more stale results

The cache is buggy and slow enough (especially as it rolls too often) though that I wonder if we should just make it an opt-in experimental feature instead. It has enough value for larger use-cases that I'm hesitant to remove it wholesale, but it seems to be more problematic than useful on average.

@ezolenko
Copy link
Owner

Yeah, I'm leaning towards removing or reworking cache too (maybe implementing caching DocumentRegistry). I've recently seen some errors being hidden until a clean build in my own projects.

@agilgur5
Copy link
Collaborator

(maybe implementing caching DocumentRegistry)

Curious what you mean by this? As in a custom implementation with filesystem caching? Something like that is mentioned in the TS Compiler docs, but never thought about implementing that 👀 I wonder how much that would increase performance 🤔

In #388 I did make the optimization for DocumentRegistry for watch mode specifically; as far as I know by its usage and docs, the built-in one is only an in-memory cache.

The dependency graph rpt2 has sounds more advanced in theory (but in practice could be less efficient), and I think something like it will still be necessary to implement shouldTransformCachedModule

@agilgur5
Copy link
Collaborator

Thinking back, I did at one point consider splitting out the cache implementation into its own library. That way different implementations could be plugged in and could be shared with other TS libraries.
At the time though, IIRC, parts of the code were a bit tied to the rpt2 implementation (the callbacks, for instance). That and I wasn't confident enough in its correctness to externalize it 😅 (could also more easily robustly test correctness if it were its own library, that being said)

@ezolenko
Copy link
Owner

Yeah, I expect typescript to have its own cache busting logic for document registry (based on compiler options and other parameters some DocumentRegistry calls take I assume), and that might handle some of the dependency tree logic we are trying to build in cache, but rollup potentially adds other reasons to retranspile by changing plugins that affect imports, so getting away from monitoring rollup config changes would be hard anyway...

@ezolenko
Copy link
Owner

Not sure if moving cache into its own library would give us much, since I expect most bugs to be related to failing to detect reasons to break cache and that requires full integration into pipeline. Unless we can make it a separate rollup plugin somehow?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind: optimization Performance, space, size, etc improvement scope: cache Related to the cache scope: watch mode Related to Rollup's watch mode solution: workaround available There is a workaround available for this issue
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants