Influence chunking #3520

LarsDenBakker · 2020-04-28T12:00:32Z

Feature Use Case

I am migrating a large web application from webpack to Rollup. In this application we use a translation system based on ES modules. The application is built out of hundreds of components, spread across many npm packages.

Each component fetches its translations (texts, labels etc.) with a pattern like this:

switch (document.documentElement.lang) {
  case 'en-GB': return import('./translations/en-GB.js');
  case 'nl-NL': return import('./translations/nl-NL.js');
  case 'nl-BE': return import('./translations/nl-BE.js');
  ...etc.
}

In terms of code maintenance this is great: each component fetches its own translations based on the language of the user, and we never load any unnecessary code. However, it leads to a lot of parallel dynamic import requests at runtime, even after bundling.

In webpack I wrote a plugin which looks at the chunking graph, and then as components are bundled together in chunks it also groups together the translations of the same language to match the chunks of the components. The effect is that for a given chunk, only one extra chunk of translations is dynamically loaded.

I'm looking for a way to implement this in Rollup, but I've not been able to. The closest option is manualChunks, but there I don't have any information about the chunking done by Rollup. The best I can do there is group modules by node_modules package.
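
For illustration, the package-based grouping mentioned above might look roughly like the sketch below. It is not the code from the application: `packageChunkName`, the `vendor-` prefix and the `chunkingOptions` object are hypothetical, and where the `manualChunks` option has to be placed depends on the Rollup version.

```javascript
// Hypothetical helper: derive a chunk name from the npm package a module lives in.
function packageChunkName(id) {
  // Normalize Windows separators so the lookup below works on every platform.
  const normalized = id.split('\\').join('/');
  // Use the last node_modules segment so nested packages are named after themselves.
  const marker = '/node_modules/';
  const index = normalized.lastIndexOf(marker);
  if (index === -1) return undefined; // not inside a package: leave it to Rollup
  const parts = normalized.slice(index + marker.length).split('/');
  // Scoped packages (@scope/name) span two path segments.
  const name = parts[0].startsWith('@') ? `${parts[0]}/${parts[1]}` : parts[0];
  return `vendor-${name.replace('@', '').replace('/', '-')}`;
}

// The function form of manualChunks is called with each module id in turn.
const chunkingOptions = {
  manualChunks(id) {
    return packageChunkName(id);
  }
};
```

This only sees one module id at a time, which is exactly the limitation described above: nothing in it knows which components end up in the same chunk.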

Feature Proposal

Add the ability for plugins to inspect the chunk graph and rearrange modules and chunks.

lukastaegert · 2020-04-28T17:00:30Z

I am not sure I understand what exactly you need from Rollup’s chunking to do what you want. Don’t you know from the file name which language a translation belongs to? The thing is, I have long been considering adding such an API but lately I had some ideas how we could make statement level chunking work. If at some point I want to add this, it will mean that parts of a module could end up in different chunks. Not sure how such an API would reflect this. So I would rather like to explore your problem a little more to understand what is missing for you about manual chunks.

LarsDenBakker · 2020-04-28T17:43:17Z

I've gathered a simplified example here: https://repl.it/repls/TealComposedDistributeddatabase

It's an application that consists of two pages, A and B. Page A depends on components A and B. Page B depends on component C.

In the final Rollup output the code for components A and B is merged with Page A, while component C is merged with page B. That's great!

But there is a separate chunk for each language and each component:

component-a-nl-NL.js
component-a-en-GB.js
component-b-nl-NL.js
component-b-en-GB.js
component-c-nl-NL.js
component-c-en-GB.js

Since I know for sure component A and B are always loaded together, I want to group their translations and avoid loading multiple chunks in parallel:

component-a+component-b-nl-NL.js
component-a+component-b-en-GB.js
component-c-nl-NL.js
component-c-en-GB.js

I don't know upfront which pages use which components, and when multiple pages use the same components there will be more fine-grained chunking. I think this is only known at the end of the build phase, so I cannot create manual chunks for this.

lukastaegert · 2020-04-28T18:13:30Z

I see, but I am not sure you actually need the chunk graph to make your decision. I would guess you could also do the grouping based on the module graph? Or how do you identify if a translation belongs to a certain component?

I am asking because it would not be too difficult to make the module graph available e.g. for a manualChunks function. Or alternatively extend the emitFile API to emit manual chunks. This would also not collide with any bigger changes to Rollup's own chunking as outlined above.

LarsDenBakker · 2020-04-28T19:48:06Z

The grouping is based on the chunks. For a given chunk I check the dynamic imports and group together any dynamic imports for translations of the same language.

Adding the module graph to manualChunks might make this possible. If I can know which modules are entrypoints and dynamic entrypoints, I can follow all the static imports and find out which dynamic imports are made. This way manualChunks becomes a really nice API for influencing the chunking of rollup.
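
A rough sketch of that traversal, assuming the module info exposes `importedIds` and `dynamicallyImportedIds`. The helper names `collectDynamicImports` and `groupByLocale`, the file layout and the in-memory `graph` are hypothetical stand-ins for illustration, not Rollup APIs.

```javascript
// Follow static imports from one entry point and collect every dynamic import
// reachable without crossing a dynamic import boundary.
function collectDynamicImports(entryId, getModuleInfo) {
  const visited = new Set([entryId]);
  const dynamicImports = new Set();
  // A Set iterator also visits ids added during iteration, so it acts as a work queue
  // and circular static imports cannot cause an infinite loop.
  for (const id of visited) {
    const info = getModuleInfo(id);
    if (!info) continue;
    for (const target of info.dynamicallyImportedIds) dynamicImports.add(target);
    for (const dependency of info.importedIds) visited.add(dependency);
  }
  return [...dynamicImports];
}

// Group translation modules by the locale encoded in their file name (assumed layout).
function groupByLocale(ids) {
  const groups = {};
  for (const id of ids) {
    const match = /\/translations\/([^/]+)\.js$/.exec(id);
    if (!match) continue;
    (groups[match[1]] = groups[match[1]] || []).push(id);
  }
  return groups;
}

// Hypothetical module graph standing in for the real getModuleInfo.
const graph = {
  'page-a.js': { importedIds: ['component-a.js', 'component-b.js'], dynamicallyImportedIds: [] },
  'component-a.js': {
    importedIds: [],
    dynamicallyImportedIds: ['component-a/translations/en-GB.js', 'component-a/translations/nl-NL.js']
  },
  'component-b.js': {
    importedIds: [],
    dynamicallyImportedIds: ['component-b/translations/en-GB.js', 'component-b/translations/nl-NL.js']
  },
  'component-c.js': { importedIds: [], dynamicallyImportedIds: ['component-c/translations/en-GB.js'] }
};
const getModuleInfo = id => graph[id];

const translationsForPageA = collectDynamicImports('page-a.js', getModuleInfo);
const groupedForPageA = groupByLocale(translationsForPageA);
```

Here `groupedForPageA` holds one list per locale containing the translations of components A and B, which is the grouping the merged translation chunks would need.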

LarsDenBakker · 2020-05-01T17:41:50Z

I've been playing around with the existing hooks, and I have some ideas to make it work. For each entrypoint I could emit a chunk which imports all the translations required for a locale for that module and its static dependencies.

The biggest hurdle right now is that when I'm in the load hook for that emitted chunk, I have no idea when the module I emitted this chunk for has finished resolving all of its dependencies so that I can use getModuleInfo and traverse the module graph. If I just wait a couple of seconds everything I need is there.

Perhaps a beforeBuildEnd hook where we could still emit chunks would be something that can work?

lukastaegert · 2020-05-01T18:39:55Z

Probably, but I'm not sure this will be the result you want. If you emit chunks, it does not mean all your translations will end up in this chunk unless you can also make it a manual chunk, which is currently not possible via the plugin API (of course you could feed your results from a plugin into a manualChunks function, but this seems slightly convoluted to me).
On the other hand as I understand, all your translations are already part of the module graph, you just want to group them correctly. So a pure chunking solution would still be best, wouldn't it? The thing is, chunks are only created after the module graph is complete. So we could give the complete module graph info to a manualChunks function.

At its simplest, we could extend manualChunks like this:

{ [chunkAlias: string]: string[] } |
(
  (
    id: string,
    {moduleIds, getModuleInfo}
  ) => string | void
)

where moduleIds and getModuleInfo would be the same as on the plugin context. Would this go in the right direction?

LarsDenBakker · 2020-05-01T20:14:14Z

For the chunking that's definitely the right direction. But one thing I realized is that even when creating manual chunks for the translations, it won't reduce the number of files sent down the wire, because for each dynamic import a tiny module remains that reexports from the manual chunk. That's why I was also exploring emitting a file during load/transform, so that the dynamic imports can be rewired to this new emitted file. 🤔

lukastaegert · 2020-05-02T04:18:53Z

Actually I am currently working on a better solution that will get rid of the facades for dynamic imports by creating inline namespace objects if necessary. It will take a few more days, though, as this is unfortunately a non-trivial change that touches a lot of places 🙄

So I will focus on that first, and then maybe the manual chunk will look much better.

LarsDenBakker · 2020-05-02T07:02:49Z

That's awesome to hear! Then it will work great with manual chunks.

keithamus · 2020-05-03T18:52:29Z

Just going to chime in with a "+1" to removing dynamic import facades. Rewiring dynamic imports to the correct (manual) chunks without any intermediary files would be a huge win for us!

lukastaegert · 2020-05-05T05:34:10Z

Working draft PR here: #3535

LarsDenBakker · 2020-05-05T16:08:03Z

Looks good so far 👍

LarsDenBakker · 2020-05-07T10:58:00Z

I tried #3535 on my application, using manualChunks to force all files for a specific locale into a single chunk.

Before the change, my app produces 3304 chunks without manual chunks and 3372 chunks with manual chunks. (It's more chunks because splitting manually on locale alone doesn't account for shared code).

After the change, my app produces 3288 chunks without manual chunks, and 1942 chunks with manual chunks.

That's a massive improvement 👍 Forcing all of a locale's translations into a single chunk is not ideal; having the module graph info will be great for optimizing that further.

lukastaegert · 2020-05-07T11:01:16Z

Oh my god, what is that monster 😱
But awesome result! I will see if I can have a look at extending the API soon.

keithamus · 2020-05-07T11:46:33Z

@LarsDenBakker I don't know your setup, so this may be what you're already doing, but can't you chunk locales based on the locale name? As in, a function like manualChunks(id) { return (id.match(/\/translations\/([^\.]+)\.js$/)||[])[1] } would provide you with one chunk for each locale.
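
Spelled out as a standalone function, that suggestion might look like the following sketch. `localeChunkName` and `chunkingOptions` are names made up here, and the `/translations/<locale>.js` layout is taken from the pattern shown at the top of the thread.

```javascript
// Return the locale encoded in a translation file path, or undefined for other modules.
function localeChunkName(id) {
  const match = /\/translations\/([^/.]+)\.js$/.exec(id);
  return match ? match[1] : undefined;
}

// Every translation file for a locale lands in one chunk named after that locale;
// returning undefined leaves all other modules to Rollup's default chunking.
const chunkingOptions = {
  manualChunks(id) {
    return localeChunkName(id);
  }
};
```

Note that this puts every translation for a locale into the same chunk regardless of which page needs it.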

LarsDenBakker · 2020-05-07T11:56:14Z

@lukastaegert big enterprise application :) optimized with lots and lots of dynamic imports

@keithamus Yes, that's what I'm doing, but I don't want all translations in one chunk. I want them to follow the module graph for my application. For example, when I navigate to a page, that's when it should fetch the translations for this page and the components it uses.

lukastaegert · 2020-05-08T05:08:41Z

Here is a draft PR for the extended API, please give it a spin and tell me if this helps you! #3542

LarsDenBakker · 2020-05-08T12:30:31Z

Works perfectly! On the first call to getManualChunks I build up a dependency graph with https://github.com/jriecken/dependency-graph and figure out an efficient way to merge the translations.

I think this is really a great enhancement to the manual chunking. It would be nice to be able to use this from a plugin, though that may potentially be dangerous?

One suggestion is to call entryModuleIds and moduleIds, getEntryModuleIds and getModuleIds instead. That makes it a bit clearer.

lukastaegert · 2020-05-08T14:51:13Z

> One suggestion is to call entryModuleIds and moduleIds, getEntryModuleIds and getModuleIds instead. That makes it a bit clearer.

Makes sense.

> On the first call to getManualChunks I build up a dependency graph

Admittedly, this is the only way you can achieve what you want right now. Thinking about how to make this API more powerful so that you could work more declaratively, I think I will see if I can include the inverse dependency links into getModuleInfo as well, i.e.

{
  id: string,
  isEntry: boolean, 
  isExternal: boolean,
  hasModuleSideEffects: boolean,
  importedIds: string[],
  dynamicallyImportedIds: string[],

  // inverse links
  importedBy: string[],
  dynamicallyImportedBy: string[]
}

Then you could do something like this:

manualChunks(id, {getModuleInfo}) {
  if (isTranslationFile(id)) {
    const dependentEntryPoints = [];

    // we use a Set here so we handle each module at most once, which
    // prevents infinite loops in case of circular dependencies; translations
    // are loaded via import(), so we start from their dynamic importers
    const idsToHandle = new Set(getModuleInfo(id).dynamicallyImportedBy);

    for (const moduleId of idsToHandle) {
      const {isEntry, importedBy} = getModuleInfo(moduleId);
      if (isEntry) dependentEntryPoints.push(moduleId);

      // The Set iterator is intelligent enough to iterate over elements that
      // are added during iteration
      for (const importerId of importedBy) idsToHandle.add(importerId);
    }

    // At this point `dependentEntryPoints` contains all entries that depend on this
    // translation and we can put it into the corresponding manual chunk
  }
}

which has the advantage of having no state outside the function.
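
To make the last step concrete, here is one way the example could be finished, as a sketch only. It uses the field names proposed in this comment (`importedBy`, `dynamicallyImportedBy`) and seeds the traversal from `dynamicallyImportedBy`, on the assumption that translation files are only ever loaded through `import()`. The `translationPattern` check, the chunk naming scheme, the choice to treat dynamically imported pages like entry points, and the mock `graph` are all assumptions made for illustration.

```javascript
// Assumed layout: translation files live in a "translations" folder, named by locale.
const translationPattern = /\/translations\/([^/]+)\.js$/;

function manualChunks(id, { getModuleInfo }) {
  const match = translationPattern.exec(id);
  if (!match) return undefined; // not a translation: leave it to Rollup
  const locale = match[1];
  const dependentEntryPoints = [];
  // Translations are loaded via import(), so start from their dynamic importers.
  const idsToHandle = new Set(getModuleInfo(id).dynamicallyImportedBy);
  for (const moduleId of idsToHandle) {
    const { isEntry, importedBy, dynamicallyImportedBy } = getModuleInfo(moduleId);
    // Dynamically imported modules (such as pages) are treated like entry points.
    if (isEntry || dynamicallyImportedBy.length > 0) dependentEntryPoints.push(moduleId);
    for (const importerId of importedBy) idsToHandle.add(importerId);
  }
  if (dependentEntryPoints.length === 0) return undefined;
  // Name the chunk after the sorted entry points plus the locale.
  const names = dependentEntryPoints
    .map(entry => entry.split('/').pop().replace(/\.js$/, ''))
    .sort();
  return `${names.join('+')}-${locale}`;
}

// Mock module graph: app.js lazily loads two pages; page A uses components A and B.
const info = (isEntry, importedBy, dynamicallyImportedBy) => ({ isEntry, importedBy, dynamicallyImportedBy });
const graph = {
  '/src/app.js': info(true, [], []),
  '/src/page-a.js': info(false, [], ['/src/app.js']),
  '/src/page-b.js': info(false, [], ['/src/app.js']),
  '/src/component-a.js': info(false, ['/src/page-a.js'], []),
  '/src/component-b.js': info(false, ['/src/page-a.js'], []),
  '/src/component-c.js': info(false, ['/src/page-b.js'], []),
  '/src/component-a/translations/nl-NL.js': info(false, [], ['/src/component-a.js']),
  '/src/component-b/translations/nl-NL.js': info(false, [], ['/src/component-b.js']),
  '/src/component-c/translations/nl-NL.js': info(false, [], ['/src/component-c.js'])
};
const context = { getModuleInfo: id => graph[id] };

const chunkForA = manualChunks('/src/component-a/translations/nl-NL.js', context);
const chunkForB = manualChunks('/src/component-b/translations/nl-NL.js', context);
const chunkForC = manualChunks('/src/component-c/translations/nl-NL.js', context);
```

With this mock graph the translations of components A and B resolve to the same chunk name while component C gets its own, mirroring the grouping asked for earlier in the thread.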

lukastaegert · 2020-05-09T11:11:34Z

@LarsDenBakker I updated the PR to contain the reverse dependencies, though with slightly changed names. I also updated the PR description with the example, and I think I will try to put that into the documentation as well.

LarsDenBakker mentioned this issue Apr 28, 2020

Ability to collect chunk info before or during renderchunk #3519

Closed

lukastaegert added the t³ ✨ enhancement label May 8, 2020

lukastaegert self-assigned this May 8, 2020

lukastaegert mentioned this issue May 8, 2020

Extend manualChunks API #3542

Merged

10 tasks

lukastaegert closed this as completed in #3542 May 10, 2020

LarsDenBakker mentioned this issue Oct 2, 2020

Provide a warning when generating SystemJS chunks with circular references #3801

Open
