Influence chunking #3520

Closed
LarsDenBakker opened this issue Apr 28, 2020 · 20 comments · Fixed by #3542

@LarsDenBakker
Contributor

LarsDenBakker commented Apr 28, 2020

Feature Use Case

I am migrating a large web application from webpack to rollup. In this application we use a translation system based on es modules. The application is built out of 100s of components, spread across many NPM packages.

Each component fetches its translations (texts, labels, etc.) with a pattern like this:

switch (document.documentElement.lang) {
  case 'en-GB': return import('./translations/en-GB.js');
  case 'nl-NL': return import('./translations/nl-NL.js');
  case 'nl-BE': return import('./translations/nl-BE.js');
  // ...one case per remaining locale
}

In terms of code maintenance this is great: each component fetches its own translations based on the language of the user, and we never load any unnecessary code. It does lead to a lot of parallel dynamic import requests at runtime, even after bundling.

In webpack I wrote a plugin which looks at the chunking graph, and then as components are bundled together in chunks it also groups together the translations of the same language to match the chunks of the components. The effect is that for a given chunk, only one extra chunk of translations is dynamically loaded.

I'm looking for a way to implement this in Rollup, but I haven't found one. The closest option is manualChunks, but there I have no information about the chunking Rollup does. The best I can do there is group by node_modules package.
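For reference, grouping by package with the function form of manualChunks might look like the sketch below. The helper name packageChunkName and the example paths are made up for illustration; only the idea of a callback that maps a module id to a chunk name comes from Rollup.

```javascript
// Hypothetical helper: name a chunk after the npm package a module lives in.
function packageChunkName(id) {
  const normalized = id.replace(/\\/g, '/'); // tolerate Windows-style paths
  const marker = '/node_modules/';
  const index = normalized.lastIndexOf(marker);
  if (index === -1) return undefined; // application code: let Rollup decide
  const [first, second] = normalized.slice(index + marker.length).split('/');
  // Scoped packages (@scope/name) span two path segments.
  return first.startsWith('@') ? `${first.slice(1)}-${second}` : first;
}

console.log(packageChunkName('/app/node_modules/@acme/button/src/index.js'));
console.log(packageChunkName('/app/node_modules/lodash-es/map.js'));
console.log(packageChunkName('/app/src/page-a.js'));
```

Passing such a function as manualChunks groups by package only. It knows nothing about which components Rollup ends up bundling together, which is the limitation described above.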

Feature Proposal

Add the ability for plugins to inspect the chunk graph and rearrange modules and chunks.

@lukastaegert
Member

I am not sure I understand what exactly you need from Rollup’s chunking to do what you want. Don’t you know from the file name which language a translation belongs to? The thing is, I have long been considering adding such an API but lately I had some ideas how we could make statement level chunking work. If at some point I want to add this, it will mean that parts of a module could end up in different chunks. Not sure how such an API would reflect this. So I would rather like to explore your problem a little more to understand what is missing for you about manual chunks.

@LarsDenBakker
Contributor Author

LarsDenBakker commented Apr 28, 2020

I've gathered a simplified example here: https://repl.it/repls/TealComposedDistributeddatabase

It's an application that consists of two pages, A and B. Page A depends on components A and B. Page B depends on component C.

In the final Rollup output the code for components A and B is merged with page A, while component C is merged with page B. That's great!

But there is a separate chunk for each language and each component:

  • component-a-nl-NL.js
  • component-a-en-GB.js
  • component-b-nl-NL.js
  • component-b-en-GB.js
  • component-c-nl-NL.js
  • component-c-en-GB.js

Since I know for sure component A and B are always loaded together, I want to group their translations and avoid loading multiple chunks in parallel:

  • component-a+component-b-nl-NL.js
  • component-a+component-b-en-GB.js
  • component-c-nl-NL.js
  • component-c-en-GB.js

I don't know upfront which pages use which components, and when multiple pages use the same components the chunking will be more fine-grained. I think this is only known at the end of the build phase, so I cannot create manual chunks for it.

@lukastaegert
Member

I see, but I am not sure you actually need the chunk graph to make your decision. I would guess you could also do the grouping based on the module graph? Or how do you identify whether a translation belongs to a certain component?

I am asking because it would not be too difficult to make the module graph available e.g. for a manualChunks function. Or alternatively extend the emitFile API to emit manual chunks. This would also not collide with any bigger changes to Rollup's own chunking as outlined above.

@LarsDenBakker
Contributor Author

The grouping is based on the chunks. For a given chunk I check the dynamic imports and group together any dynamic imports for translations of the same language.

Adding the module graph to manualChunks might make this possible. If I can know which modules are entrypoints and dynamic entrypoints, I can follow all the static imports and find out which dynamic imports are made. This way manualChunks becomes a really nice API for influencing the chunking of rollup.
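The traversal described here can be sketched as follows. This is a minimal sketch, assuming getModuleInfo returns an object with importedIds and dynamicallyImportedIds for each module; the function name collectDynamicImports and the module ids in the sample graph are made up for illustration.

```javascript
// Collect every dynamic import reachable from `entryId` via static imports.
// `getModuleInfo` is assumed to return { importedIds, dynamicallyImportedIds }.
function collectDynamicImports(entryId, getModuleInfo) {
  const visited = new Set([entryId]);
  const dynamicImports = new Set();
  // A Set iterator also visits elements added during iteration, so this
  // loop doubles as a work queue and terminates on circular imports.
  for (const id of visited) {
    const info = getModuleInfo(id);
    if (!info) continue; // e.g. an unknown or external module
    for (const dynamicId of info.dynamicallyImportedIds) dynamicImports.add(dynamicId);
    for (const staticId of info.importedIds) visited.add(staticId);
  }
  return [...dynamicImports];
}

// Made-up module graph: page A statically imports components A and B.
const graph = new Map([
  ['page-a.js', { importedIds: ['component-a.js', 'component-b.js'], dynamicallyImportedIds: [] }],
  ['component-a.js', { importedIds: [], dynamicallyImportedIds: ['a/translations/nl-NL.js', 'a/translations/en-GB.js'] }],
  ['component-b.js', { importedIds: ['component-a.js'], dynamicallyImportedIds: ['b/translations/nl-NL.js', 'b/translations/en-GB.js'] }]
]);
const getModuleInfo = id => graph.get(id);

console.log(collectDynamicImports('page-a.js', getModuleInfo));
```

Running this once per entry point and dynamic entry point would yield, for each of them, the set of translation files that could be grouped by language.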

@LarsDenBakker
Contributor Author

LarsDenBakker commented May 1, 2020

I've been playing around with the existing hooks, and I have some ideas to make it work. For each entrypoint I could emit a chunk which imports all the translations required for a locale for that module and its static dependencies.

The biggest hurdle right now is that when I'm in the load hook for that emitted chunk, I have no idea when the module I emitted this chunk for has finished resolving all of its dependencies, so that I can use getModuleInfo and traverse the module graph. If I just wait a couple of seconds, everything I need is there.

Perhaps a beforeBuildEnd hook where we could still emit chunks would be something that can work?

@lukastaegert
Member

lukastaegert commented May 1, 2020

Probably, but I'm not sure this will give you the result you want. Emitting chunks does not mean all your translations will end up in that chunk unless you can also make it a manual chunk, which is currently not possible via the plugin API. (You could of course feed the results from a plugin into a manualChunks function, but that seems slightly convoluted to me.)
On the other hand, as I understand it, all your translations are already part of the module graph and you just want to group them correctly. So a pure chunking solution would still be best, wouldn't it? The thing is, chunks are only created after the module graph is complete. So we could give the complete module graph info to a manualChunks function.

At the easiest, we could extend manualChunks like this:

{ [chunkAlias: string]: string[] } |
(
  (
    id: string,
    {moduleIds, getModuleInfo}
  ) => string | void
)

where moduleIds and getModuleInfo would be the same as on the plugin context. Would this go in the right direction?

@LarsDenBakker
Contributor Author

For the chunking that's definitely the right direction. But one thing I realized is that even with manual chunks for the translations, the number of files sent over the wire won't go down, because for each dynamic import a tiny module remains that re-exports from the manual chunk. That's why I was also exploring emitting a file during load/transform, so that the dynamic imports can be rewired to this newly emitted file. 🤔

@lukastaegert
Member

Actually I am currently working on a better solution that will get rid of the facades for dynamic imports by creating inline namespace objects if necessary. Will take a few more days, though, as this is a non-trivial change unfortunately that touches a lot of places 🙄

So I will focus on that first, and then maybe the manual chunks will look much better.

@LarsDenBakker
Contributor Author

That's awesome to hear! Then it will work great with manual chunks.

@keithamus
Contributor

Just going to chime in with a "+1" to removing dynamic import facades. Rewiring dynamic imports to the correct (manual) chunks without any intermediary files would be a huge win for us!

@lukastaegert
Member

Working draft PR here: #3535

@LarsDenBakker
Contributor Author

Looks good so far 👍

@LarsDenBakker
Contributor Author

I tried #3535 on my application, using manualChunks to force all files for a specific locale into a single chunk.

Before the change, my app produces 3304 chunks without manual chunks and 3372 chunks with manual chunks. (It's more chunks because splitting manually on locale alone doesn't account for shared code).

After the change, my app produces 3288 chunks without manual chunks, and 1942 chunks with manual chunks.

That's a massive improvement 👍 Forcing all of a locale's translations into a single chunk is not ideal, though; having the module graph info will be great for optimizing that further.

@lukastaegert
Member

Oh my god, what is that monster 😱
But awesome result! I will see if I can have a look at extending the API soon.

@keithamus
Contributor

@LarsDenBakker I don't know your setup, so this may be what you're already doing, but can't you chunk locales based on the locale name? A function like manualChunks(id) { return (id.match(/\/translations\/([^.]+)\.js$/) || [])[1] } would give you one chunk per locale.
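Spelled out as a standalone function, the suggestion could look like the sketch below. The dot before js is escaped so that only real .js files match, and the file paths are made up for illustration.

```javascript
// Name the chunk after the locale file inside a `translations` directory.
// Any other module returns undefined and is chunked by Rollup as usual.
function manualChunks(id) {
  const match = /\/translations\/([^.]+)\.js$/.exec(id);
  return match ? match[1] : undefined;
}

console.log(manualChunks('/src/component-a/translations/nl-NL.js'));
console.log(manualChunks('/src/component-b/translations/nl-NL.js'));
console.log(manualChunks('/src/component-a/index.js'));
```

Both nl-NL files map to the same chunk name, so every translation for a locale lands in one chunk regardless of which component uses it.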

@LarsDenBakker
Contributor Author

@lukastaegert big enterprise application :) optimized with lots and lots of dynamic imports

@keithamus Yes, that's what I'm doing, but I don't want all translations in one chunk; I want them to follow the module graph of my application. For example, when I navigate to a page, that's when it should fetch the translations for that page and the components it uses.

@lukastaegert
Member

Here is a draft PR for the extended API, please give it a spin and tell me if this helps you! #3542

@LarsDenBakker
Contributor Author

LarsDenBakker commented May 8, 2020

Works perfectly! On the first call to getManualChunks I build up a dependency graph with https://github.com/jriecken/dependency-graph and figure out an efficient way to merge the translations.
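A rough sketch of the "build the graph on the first call" idea, without the dependency-graph library, is shown below. It assumes the second argument exposes getModuleIds and getModuleInfo (the exact names were still being discussed in this thread). The helpers buildImporterIndex, withImporterIndex and baseName, the sample graph and the naming policy are all made up for illustration and are not the actual implementation used in the application.

```javascript
// Invert the forward links reported by getModuleInfo into an
// "imported by" index: module id -> Set of ids that import it.
function buildImporterIndex(getModuleIds, getModuleInfo) {
  const importers = new Map();
  for (const id of getModuleIds()) {
    const { importedIds, dynamicallyImportedIds } = getModuleInfo(id);
    for (const target of [...importedIds, ...dynamicallyImportedIds]) {
      if (!importers.has(target)) importers.set(target, new Set());
      importers.get(target).add(id);
    }
  }
  return importers;
}

// Wrap a chunk-naming function so the index is built on the first call only.
function withImporterIndex(chunkFor) {
  let importers;
  return (id, { getModuleIds, getModuleInfo }) => {
    if (!importers) importers = buildImporterIndex(getModuleIds, getModuleInfo);
    return chunkFor(id, importers);
  };
}

// Made-up graph: components a and b both load the same translation file.
const graph = new Map([
  ['/src/a.js', { importedIds: [], dynamicallyImportedIds: ['/src/shared/translations/nl-NL.js'] }],
  ['/src/b.js', { importedIds: ['/src/a.js'], dynamicallyImportedIds: ['/src/shared/translations/nl-NL.js'] }],
  ['/src/shared/translations/nl-NL.js', { importedIds: [], dynamicallyImportedIds: [] }]
]);
let indexBuilds = 0; // counts how often the module graph is walked
const api = {
  getModuleIds: () => { indexBuilds += 1; return graph.keys(); },
  getModuleInfo: id => graph.get(id)
};

// Illustrative policy: name a translation chunk after its direct importers.
const baseName = id => id.split('/').pop().replace(/\.js$/, '');
const manualChunks = withImporterIndex((id, importers) => {
  const match = /\/translations\/([^.]+)\.js$/.exec(id);
  if (!match || !importers.has(id)) return undefined;
  const owners = [...importers.get(id)].map(baseName).sort().join('+');
  return `${owners}-${match[1]}`;
});

console.log(manualChunks('/src/shared/translations/nl-NL.js', api));
console.log(manualChunks('/src/a.js', api));
console.log(indexBuilds);
```

The cached index is state that lives outside the chunking function, so the wrapper has to be recreated for every build.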

I think this is really a great enhancement to the manual chunking. It would be nice to be able to use this from a plugin, though that may potentially be dangerous?

One suggestion is to rename entryModuleIds and moduleIds to getEntryModuleIds and getModuleIds. That makes it a bit clearer.

@lukastaegert
Member

lukastaegert commented May 8, 2020

One suggestion is to rename entryModuleIds and moduleIds to getEntryModuleIds and getModuleIds. That makes it a bit clearer.

Makes sense.

On the first call to getManualChunks I build up a dependency graph

Admittedly, this is the only way you can achieve what you want right now. Thinking about how to make this API more powerful so that you could work more declaratively, I think I will see if I can include the inverse dependency links into getModuleInfo as well, i.e.

{
  id: string,
  isEntry: boolean, 
  isExternal: boolean,
  hasModuleSideEffects: boolean,
  importedIds: string[],
  dynamicallyImportedIds: string[],

  // inverse links
  importedBy: string[],
  dynamicallyImportedBy: string[]
}

Then you could do something like this:

manualChunks(id, {getModuleInfo}) {
  if (isTranslationFile(id)) {
    const dependentEntryPoints = [];

    // we use a Set here so we handle each module at most once, which
    // prevents infinite loops in case of circular dependencies
    const idsToHandle = new Set(getModuleInfo(id).importedBy);

    for (const moduleId of idsToHandle) {
      const {isEntry, importedBy} = getModuleInfo(moduleId);
      if (isEntry) dependentEntryPoints.push(moduleId);

      // The Set iterator is intelligent enough to iterate over elements that
      // are added during iteration
      for (const importerId of importedBy) idsToHandle.add(importerId);
    }

    // At this point `dependentEntryPoints` contains all entries that depend on this
    // translation and we can put it into the corresponding manual chunk
  }
}

which has the advantage of having no state outside the function.
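To make the idea concrete, here is one way the sketch could be completed and run against a mock graph. It uses the inverse-link names proposed in this thread (importedBy, dynamicallyImportedBy), which may differ from what was eventually merged. The helpers localeOf and baseName, the chunk naming scheme and the sample graph are assumptions made for illustration. Because translation files are loaded with import(), this version starts from dynamicallyImportedBy rather than importedBy.

```javascript
// Hypothetical helpers, for illustration only.
const localeOf = id => (/\/translations\/([^.]+)\.js$/.exec(id) || [])[1];
const baseName = id => id.split('/').pop().replace(/\.js$/, '');

function manualChunks(id, { getModuleInfo }) {
  const locale = localeOf(id);
  if (!locale) return undefined; // not a translation: leave it to Rollup

  const dependentEntryPoints = [];
  // Start from the modules that dynamically import this translation file.
  const idsToHandle = new Set(getModuleInfo(id).dynamicallyImportedBy);
  for (const moduleId of idsToHandle) {
    const { isEntry, importedBy } = getModuleInfo(moduleId);
    if (isEntry) dependentEntryPoints.push(moduleId);
    // Elements added during iteration are still visited by the Set iterator.
    for (const importerId of importedBy) idsToHandle.add(importerId);
  }

  if (dependentEntryPoints.length === 0) return undefined;
  // Translations needed by the same set of entries share one chunk per locale.
  const owners = dependentEntryPoints.map(baseName).sort().join('+');
  return `${owners}-${locale}`;
}

// Made-up graph with inverse links only, as that is all the function reads.
const mod = (isEntry, importedBy, dynamicallyImportedBy) =>
  ({ isEntry, importedBy, dynamicallyImportedBy });
const graph = new Map([
  ['/src/page-a.js', mod(true, [], [])],
  ['/src/page-b.js', mod(true, [], [])],
  ['/src/a/index.js', mod(false, ['/src/page-a.js'], [])],
  ['/src/b/index.js', mod(false, ['/src/page-a.js'], [])],
  ['/src/c/index.js', mod(false, ['/src/page-a.js', '/src/page-b.js'], [])],
  ['/src/a/translations/nl-NL.js', mod(false, [], ['/src/a/index.js'])],
  ['/src/b/translations/nl-NL.js', mod(false, [], ['/src/b/index.js'])],
  ['/src/c/translations/nl-NL.js', mod(false, [], ['/src/c/index.js'])]
]);
const api = { getModuleInfo: id => graph.get(id) };

console.log(manualChunks('/src/a/translations/nl-NL.js', api));
console.log(manualChunks('/src/b/translations/nl-NL.js', api));
console.log(manualChunks('/src/c/translations/nl-NL.js', api));
```

In this mock graph the translations of components a and b end up in the same chunk because only page A depends on them, while component c, which both pages use, gets a separate shared chunk. Pages that are themselves loaded through import() are not entries, so a real setup would probably also need to treat dynamically imported modules as grouping boundaries.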

@lukastaegert
Member

@LarsDenBakker I updated the PR to contain the reverse dependencies, though with slightly changed names. I also updated the PR description with the example, I think I will try to put that into the documentation as well.
