Incremental Bundling #6047

Open
AGawrys opened this issue Mar 23, 2021 · 4 comments

@AGawrys
Contributor

AGawrys commented Mar 23, 2021

👊 RFC - Incremental Bundling

Created with: @joeyslater

🔦 Context

Currently, developers working on medium-to-large applications wait too long, on the order of seconds to minutes, to see code changes reflected while developing with Parcel. After the initial build, a code change such as adding a statement or altering an import triggers a non-negligible rebuild. This hurts the developer experience, since changes can only be viewed once Parcel has finished the entire lifecycle.

🤔 Case Study

For a medium-sized application, here is a trace that highlights where time is being spent:

medsizeapplication_trace

| Area of Parcel | Time (s) |
| --- | --- |
| bundleRunner.bundle | 8.45 |
| bundler.fromAsset | 0.42 |
| bundler.bundle | 5.27 |
| bundler.step1 | 2.77 |
| bundler.step2 | 0.39 |
| bundler.deduplicate | 1.93 |
| bundler.step4 | 0.26 |
| bundler.renameBundlers | 0.06 |
| applyRuntimes | 1.04 |
| packageRunner.writeBundle | 7.15 |
| packager.package | 1.96 s (thread 1), starts about 2 seconds in, write to cache: 0.21 s; 0.25 s (thread 2), starts about 3.5 seconds in, write to cache: 0.05 s |
| packager.writeToDist | 1.44 s total, occurs 369 times |
| Total | 15.6 |

Roughly half of the rebuild time is spent in bundling, where we’ve identified redundant building of our Bundle Graph. 70% of bundling is spent rebuilding / packaging a graph that could be mutated in place instead. The most impact can be made by prioritizing the reduction of redundant calculation in the build. Success would be an improvement in build times for developers of upwards of 5s.

✏️ Proposal

rebundling

Incremental Bundling

Altering the bundling process requires a rewrite of the bundler as well as changes to the API. However, our changes may be localized by exploiting the plugin system Parcel already has in place.

High-level updates to the bundler

  • Take the assets in changedAssets, find where they are used in the bundle phase, and update from there.
    • Updating the API will allow custom bundlers to create their own Iterative Bundler plugin if needed.
  • The BundleGraph is already loaded from the cache when zero changes take place; start with that cached graph regardless of what changes were made.
  • Have a separate IterativeBundler (or have DefaultBundler handle both the initial and the iterative case) for when changedAssets does not include the entire AssetGraph.
  • Traverse the cached BundleGraph starting from the root and perform a re-bundle from that point.
    • For simple changes (no imports or exports added or removed), these bundles should not have new edges.
    • For new edges, the bundler will need to figure out how these new edges & nodes will impact the bundle graph.
    • Changes may not arrive one file at a time, e.g. applying a patch or "save all" in an IDE. We will need to test these cases.
    • Facebook Metro calculates a delta (from the modified / deleted modules) and traverses only what is needed instead of the full tree.
    • Note: Marking as questionable, probably part of the iterative development process.
  • Selectively deduplicate if needed
    • Production builds do not deduplicate in the optimize stage. Perhaps the changes may not need to be deduplicated. For example, a change that does not impact asset graph edges / relationships may not need to be deduplicated.
  • Add an additional return to the bundler, changedBundles, to give packager an opportunity to only package what’s been updated.
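
The steps in the list above can be sketched roughly as follows. This is a minimal illustration and not Parcel's actual API: the plain Map / Set data shapes, the depsBefore / depsAfter fields, and the function names sameDependencies and planIncrementalUpdate are all hypothetical. It shows two ideas from the proposal: mapping changedAssets to the bundles that contain them (changedBundles), and detecting whether a change added or removed edges.

```javascript
// Hypothetical data shapes (NOT Parcel's real BundleGraph API):
//   bundles:       Map<bundleId, Set<assetId>>
//   changedAssets: Map<assetId, {depsBefore: string[], depsAfter: string[]}>

// True when both lists name exactly the same set of dependency specifiers.
function sameDependencies(before, after) {
  const a = new Set(before);
  const b = new Set(after);
  if (a.size !== b.size) return false;
  for (const specifier of b) {
    if (!a.has(specifier)) return false;
  }
  return true;
}

// Work out which bundles contain a changed asset, and whether any change
// altered the asset's dependencies (which may add or remove graph edges).
function planIncrementalUpdate(bundles, changedAssets) {
  const changedBundles = new Set();
  let needsRebundle = false;

  for (const [assetId, change] of changedAssets) {
    if (!sameDependencies(change.depsBefore, change.depsAfter)) {
      needsRebundle = true;
    }
    for (const [bundleId, assetIds] of bundles) {
      if (assetIds.has(assetId)) changedBundles.add(bundleId);
    }
  }

  return {changedBundles, needsRebundle};
}
```

Under these assumptions, a "simple" edit yields needsRebundle === false and a small changedBundles set, while a batch of edits (a patch, "save all") is handled by passing several entries in changedAssets at once.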

Incremental Packaging

Ideally this would follow from incremental bundling, since for a medium-sized application, writing bundles takes almost as much time as bundling.

Once incremental bundling is implemented, the bundler should be able to provide changedBundles (similar to changedAssets), so the packaging phase can target only those bundles that were updated.

Since packager accepts bundles, Parcel will only pass changed bundles to the Packager.


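import {Packager} from '@parcel/plugin';

export default new Packager({
  // With incremental packaging, Parcel would call this only for changed bundles.
  async package({ bundle }) {
    // ...
    return { contents, map };
  },
});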
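
As a rough sketch of how the core could decide which bundles to hand to the Packager, assuming the bundler returns a changedBundles set and the previous build's outputs are still cached: the function name splitBundlesForPackaging, the previousOutputs map, and the bundle shape ({id}) are hypothetical and only illustrate the idea.

```javascript
// Hypothetical helper (not part of Parcel): split bundles into those that
// must be packaged again and those whose previous output can be reused.
//   bundles:         Array<{id: string}>
//   changedBundles:  Set<bundleId>, as proposed for the bundler to return
//   previousOutputs: Map<bundleId, output> from the last build
function splitBundlesForPackaging(bundles, changedBundles, previousOutputs) {
  const toPackage = [];
  const reused = new Map();

  for (const bundle of bundles) {
    const unchanged = !changedBundles.has(bundle.id);
    if (unchanged && previousOutputs.has(bundle.id)) {
      // Nothing changed and we have an earlier result: skip packaging.
      reused.set(bundle.id, previousOutputs.get(bundle.id));
    } else {
      // Changed, or never packaged before: must go through the Packager.
      toPackage.push(bundle);
    }
  }

  return {toPackage, reused};
}
```

Only the bundles in toPackage would then be passed to package({ bundle }); everything in reused keeps its earlier contents and map.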

🦋 Updates and Current Implementation

The current prototype, when the --incremental flag is enabled, isolates a user’s transformation (e.g. adding a dependency) and then uses that subgraph (an AssetGraph of "islands") together with the previously cached BundleGraph to produce an updated BundleGraph.

The TransformationGraph is merged with the cached BundleGraph, and the update() function takes in the merged graph and updates all relevant bundles.
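
To make the merge step concrete, here is a minimal sketch using plain Maps and Sets. It is not the prototype's implementation: the graph shape ({nodes, edges}), the function name mergeSubgraph, and the rule that the subgraph's nodes and outgoing edges override the cached ones are all assumptions made for illustration.

```javascript
// Hypothetical graph shape (not Parcel's Graph class):
//   nodes: Map<nodeId, value>
//   edges: Map<fromId, Set<toId>>
//
// Merge a small "transformation" subgraph into a cached graph. Assumption:
// the subgraph reflects the latest state, so its nodes and the outgoing
// edges of its nodes replace whatever the cached graph had for them.
function mergeSubgraph(cached, subgraph) {
  // Copy the cached graph so the original stays untouched.
  const nodes = new Map(cached.nodes);
  const edges = new Map();
  for (const [from, tos] of cached.edges) {
    edges.set(from, new Set(tos));
  }

  const touched = new Set();
  for (const [id, value] of subgraph.nodes) {
    nodes.set(id, value);
    touched.add(id);
  }
  for (const [from, tos] of subgraph.edges) {
    edges.set(from, new Set(tos));
    touched.add(from);
  }

  // `touched` is what an update() step would need to revisit.
  return {graph: {nodes, edges}, touched};
}
```

For the "add dependency bar to foo" case, the subgraph would contain foo, bar, and the edge foo → bar, and touched would tell update() which parts of the merged graph need to be re-bundled.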

✨ This method has decreased the build time on save by ~47% on a larger scale application.

| Time (s) | Mean | p50 | p90 |
| --- | --- | --- | --- |
| Baseline | 16.17 | 15.51 | 16.82 |
| With Incremental | 8.54 | 8.50 | 8.36 |

Diagrams

This diagram shows the TransformationSubgraph for adding dependency bar to foo. It is used in place of the AssetGraph in the Bundling phase.

AssetGraph_Transformation_1

Here's a high level overview of the general logic.

incrementalBundlingfunctionchanges

🧷 Current Concerns and Q's

  1. With this method, merge() takes on another responsibility besides its use in applyRuntimes.
  2. The Bundler API is altered, and users would have methods to remove and add assets to the BundleGraph.
  3. This is still a prototype, so removal of assets and certain edge cases have not been accounted for or tested yet.

🎈 Risks

  1. Breaking changes (API contract changes) versus v2 stable roll out
    1. Option 1: Parcel API is not stable for the stable release
    2. Option 2: Would be released under a new major release
  2. Would an iterative approach to bundling create different bundle / structure than a build?
    1. The dev/prod modes already produce different results, as the final deduplication is not run in dev mode.

Release/Rollout

🥉 Other Relevant Efforts

Update to data structures / types

  • Serialization / deserialization and write actions can be optimized for large applications (scaling per amount of needed assets and dependencies).
  • Using a more efficient graph representation. With medium-to-large applications, the number of edges & nodes is not negligible, so a more compact representation of edges / nodes would help. Applications at scale will benefit from both the space & time savings.
  • Moving string types to enums, e.g. edge types.
  • Potentially move to non-JavaScript objects for some items (similar to Metro), where data is represented as an array instead of a data object. An example would be an edge, which would go from (to, from, type) to [to, from, type].
  • Updating string identifiers to integers (Changing graph identifiers from content keys to numbers #6000) -- Probably the more straightforward change: 32 bytes per UUID string vs. a max of 8 bytes per integer will be a compounded saving, as each identifier is represented multiple times throughout the graph.
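
A small sketch of what the last three bullets could look like together, with loudly stated assumptions: the IdInterner class, the EdgeType values, and packEdges are hypothetical names, and this is not how #6000 or Parcel's graph is implemented. It maps string content keys to integers, uses a numeric enum for edge types, and stores each edge as three slots [to, from, type] in a Uint32Array.

```javascript
// Hypothetical: assigns a small integer to each string content key.
class IdInterner {
  constructor() {
    this.idByKey = new Map();
    this.keyById = [];
  }

  intern(key) {
    let id = this.idByKey.get(key);
    if (id === undefined) {
      id = this.keyById.length;
      this.idByKey.set(key, id);
      this.keyById.push(key);
    }
    return id;
  }
}

// Hypothetical numeric enum replacing string edge types.
const EdgeType = Object.freeze({dependency: 0, bundle: 1});

// Pack edge objects {to, from, type} into a flat typed array where every
// edge occupies three consecutive slots: [to, from, type].
function packEdges(edges, interner) {
  const packed = new Uint32Array(edges.length * 3);
  edges.forEach((edge, i) => {
    const type = EdgeType[edge.type];
    if (type === undefined) {
      throw new Error(`Unknown edge type: ${edge.type}`);
    }
    packed[i * 3] = interner.intern(edge.to);
    packed[i * 3 + 1] = interner.intern(edge.from);
    packed[i * 3 + 2] = type;
  });
  return packed;
}
```

Each content key string is stored once in the interner, and every further reference to it in the edge list costs a 4-byte integer, which is where the compounded saving described above would come from.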

Before

| Graph Type | Nodes | Edges | Serialize w/ Write | Deserialize w/ Read | Serialized Memory | Total Cold Build |
| --- | --- | --- | --- | --- | --- | --- |
| BundleGraph | 48348 | 127683 | 1861 ms | 2509 ms | 68.96 mb | 741.96s |
| RequestGraph | 92336 | 715971 | 11133 ms | 5102 ms | 188.32 mb | |

With Numeric Ids

| Graph Type | Nodes | Edges | Serialize w/ Write | Deserialize w/ Read | Serialized Memory | Total Cold Build |
| --- | --- | --- | --- | --- | --- | --- |
| BundleGraph | 48348 | 127683 | 1361 ms (-26%) | 1758 ms (-30%) | 59.24 mb (-14%) | 588.86s (-20%) |
| RequestGraph | 92344 | 716001 | 7713 ms (-31%) | 2996 ms (-41%) | 102.43 mb (-45%) | |

Port the bundler (and other areas of Core) to Rust LONGTERM

  • Eventually, a port of the bundler would benefit from Rust’s memory management and access to parallelism. As an example, moving from Babel to SWC, a Rust-based compiler, brought significant time savings.
  • Idea to be expanded on post Incremental bundling
@devongovett
Member

Excited for this! I think perhaps we could do something like this for the Bundler API, rather than having two separate bundler plugins, which seems harder to fit into the existing plugin system:

export default new Bundler({
  bundle({bundleGraph}) {
    // perform initial bundling
  },
  optimize({bundleGraph}) {
    // perform initial optimization
  },
  update({bundleGraph, changedAssets}) {
    // incrementally update cached bundle graph
  }
});

One other question I had would be whether you think incremental bundling will occur only for dev builds or for cached prod builds as well? In that case, we may need another method to update an optimized build, or perhaps just a shouldOptimize flag passed to update?

Alternatively, we could keep only the bundle and optimize methods, and pass an isIncremental flag to each rather than having a separate update method? That would be more of a breaking API change though, whereas update is additive. What do you think?
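
To illustrate why a separate update method is additive, here is a minimal sketch of how the core might dispatch between the two paths. Everything in it is an assumption for illustration: runBundler, cachedBundleGraph, and the synchronous call style are hypothetical and do not reflect Parcel's actual internals.

```javascript
// Hypothetical dispatcher: use the incremental path only when a cached
// graph exists AND the plugin opted in by defining update(). Plugins that
// only define bundle() and optimize() keep working unchanged.
function runBundler(plugin, options) {
  const {bundleGraph, cachedBundleGraph, changedAssets, shouldOptimize} = options;

  if (cachedBundleGraph != null && typeof plugin.update === 'function') {
    plugin.update({bundleGraph: cachedBundleGraph, changedAssets});
    return {graph: cachedBundleGraph, mode: 'incremental'};
  }

  // Fallback: full bundling, as happens today.
  plugin.bundle({bundleGraph});
  if (shouldOptimize) {
    plugin.optimize({bundleGraph});
  }
  return {graph: bundleGraph, mode: 'full'};
}
```

With the isIncremental alternative, the same decision would instead live inside every plugin's bundle and optimize methods, which is why it would be the more breaking of the two options.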

@joeyslater
Contributor

joeyslater commented Mar 23, 2021

I think our initial thought process was to flag incremental bundling, which I believe should work for both options. You are absolutely right that the additive approach is preferable, since we could potentially flag it similarly to how we flag optimize. I think our concern was trying to minimize the amount of risk to the default bundler and do this "incrementally (ha)", but that's likely improbable.

Prod builds would be interesting, especially when we move to relative pathing (I like the idea of giving someone the parcel cache then re-building).

@jondlm
Contributor

jondlm commented Jun 5, 2023

Has there been any movement on this one? I'm taking a closer look at Parcel for some of our large codebases and am really curious about this architectural shift.

@mischnic
Member

mischnic commented Jun 5, 2023

One related PR was #6514 (where bundling is skipped completely if you didn't change any imports or exports).
