Make swc compile faster. #7071

Closed
Boshen opened this issue Mar 13, 2023 · 34 comments · Fixed by #8485

@Boshen

Boshen commented Mar 13, 2023

Describe the feature

Some of the crates compile ridiculously slowly. It would be really nice if we could speed them up a bit.

From cargo build --release --timings:

[Image: cargo build --release --timings graph]

Additional context

Cross link: web-infra-dev/rspack#2202


Update:

swc_ecma_visit and swc_css_visit are blocking compilation for a whole minute due to heavy usage of macros. See https://github.com/swc-project/swc/blob/main/crates/swc_ecma_visit/src/lib.rs

@Boshen Boshen changed the title from "Make swc_ecma_minifier compile faster." to "Make swc compile faster." Mar 14, 2023
@Boshen
Author

Boshen commented Mar 14, 2023

The culprits are these three crates, which block everything else and take a minute each to compile:

image

I think the code produced by the macros inside these crates should be generated by a script and then checked in as plain source.
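
One way to read this suggestion: a small generator program emits ordinary Rust source, and that output is committed, so downstream builds never have to expand the macro. Below is a minimal sketch of such a generator. The names `NodeDef` and `generate_visit_trait` and the node list are hypothetical, and the sketch assumes each child field is named after its type; it is not swc's actual generator.

```rust
/// Hypothetical description of one AST node: its type name and the
/// types of the child nodes it contains.
pub struct NodeDef {
    pub name: &'static str,
    pub children: &'static [&'static str],
}

/// Converts a type name such as `BinExpr` into `bin_expr`.
pub fn snake_case(name: &str) -> String {
    let mut out = String::new();
    for (i, ch) in name.chars().enumerate() {
        if ch.is_ascii_uppercase() {
            if i > 0 {
                out.push('_');
            }
            out.push(ch.to_ascii_lowercase());
        } else {
            out.push(ch);
        }
    }
    out
}

/// Emits the source text of a `Visit` trait with one method per node.
/// Assumption: a child of type `Expr` lives in a field called `expr`.
pub fn generate_visit_trait(nodes: &[NodeDef]) -> String {
    let mut src = String::from("pub trait Visit {\n");
    for node in nodes {
        let method = snake_case(node.name);
        src.push_str(&format!(
            "    fn visit_{}(&mut self, n: &{}) {{\n",
            method, node.name
        ));
        for child in node.children {
            let field = snake_case(child);
            src.push_str(&format!("        self.visit_{}(&n.{});\n", field, field));
        }
        src.push_str("    }\n");
    }
    src.push_str("}\n");
    src
}
```

The returned string would be written to a file in the crate and reviewed like any other source, which trades macro expansion time for a regeneration step whenever the AST changes.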

@Boshen
Author

Boshen commented Mar 14, 2023

Codegen is performing really badly for the minifiers:

image

@Boshen
Author

Boshen commented Mar 14, 2023

I wonder if this works: https://github.com/dtolnay/watt

@kdy1
Member

kdy1 commented Mar 14, 2023

Are you using codegen-units=1?

@kdy1
Member

kdy1 commented Mar 14, 2023

Ah yeah you are using it.

https://github.com/web-infra-dev/rspack/blob/cdf6a52a39f37a8ce2975692b7c46abc729169b6/Cargo.toml#LL12

I think that's the main problem

@Boshen
Author

Boshen commented Mar 14, 2023

image

Changing to default codegen-units doesn't help (as seen from above), it's the macros ;-)

@kdy1
Member

kdy1 commented Mar 14, 2023

I know. I'm talking about codegen times of transform/minifier crates.

@Boshen
Author

Boshen commented Mar 14, 2023

I wonder if the macro implementations are sub-optimal, for example https://users.rust-lang.org/t/5-hours-to-compile-macro-what-can-i-do/36508

@kdy1
Member

kdy1 commented Mar 14, 2023

Yeah, they are not optimal. It's a known issue

@Boshen
Author

Boshen commented Mar 14, 2023

Assign this to me if you don't have the time, I'll dig deeper.

@kdy1
Member

kdy1 commented Mar 14, 2023

This is not a focus, at least at the moment. I have time, but I don't want to use my personal time for this.

You can create fake types like Expr(Tokens) or Stmt(Tokens) that store tokens as-is, so the proc macros don't have to parse them. But this should be conditional using cfg, so that the tokens can still be verified by enabling a feature flag.

Currently pmutil stores tokens in a parsed form, but Tokens => Expr => Tokens is a waste.
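
A rough sketch of that idea follows. It uses a `String` as a stand-in for a real token stream type so that it needs no external crates, and the feature name `verify-tokens` and the `delimiters_balanced` check are hypothetical placeholders for real parsing. It only illustrates the shape of the approach, not how pmutil works.

```rust
/// Stand-in for a real token stream type such as
/// `proc_macro2::TokenStream`; a `String` keeps this sketch dependency-free.
pub type Tokens = String;

/// "Fake" expression node: it only wraps the tokens it was given.
pub struct Expr(Tokens);

impl Expr {
    pub fn new(tokens: Tokens) -> Self {
        // Only compiled when the hypothetical `verify-tokens` feature is on,
        // so default builds skip validation entirely.
        #[cfg(feature = "verify-tokens")]
        assert!(delimiters_balanced(&tokens), "malformed expression tokens");
        Expr(tokens)
    }

    /// Hands the tokens back unchanged: no Tokens => Expr => Tokens round trip.
    pub fn into_tokens(self) -> Tokens {
        self.0
    }
}

/// Cheap placeholder for real parsing: checks that (), [] and {} pair up.
#[cfg(feature = "verify-tokens")]
fn delimiters_balanced(tokens: &str) -> bool {
    let mut stack = Vec::new();
    for ch in tokens.chars() {
        match ch {
            '(' | '[' | '{' => stack.push(ch),
            ')' => {
                if stack.pop() != Some('(') {
                    return false;
                }
            }
            ']' => {
                if stack.pop() != Some('[') {
                    return false;
                }
            }
            '}' => {
                if stack.pop() != Some('{') {
                    return false;
                }
            }
            _ => {}
        }
    }
    stack.is_empty()
}
```

With the feature off, `Expr::new` is a plain wrapper; turning it on (for example in CI) would catch malformed input without slowing down every build.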

@nnethercote

Summoned via https://twitter.com/boshen_c/status/1635842195113787392, I took a look at this.

First, I'll consider the high codegen times for some of the crates.

I have an Intel i9-7940X which has 14 physical cores and 28 virtual cores. My compile times for swc were:

  • cargo check: 1m00s
  • cargo build: 1m19s
  • cargo build --release: 2m19s

So, a pretty small jump going from check to build, but a big jump going from build to build --release.

--timings shows that release builds are spending a huge amount of time in codegen, as mentioned above. Here is some of the --timings graph for a debug build:

[Image: --timings graph, debug build]

And here is the same part for an opt build:

[Image: --timings graph, release build]

Note that the purple (codegen) part is massively bigger in the release build. I've never seen so much purple in a --timings graph. And it only happens in the swc* crates at the bottom of the crate graph. Ones higher up have a much higher blue-to-purple ratio, like I'd expect.

swc_node_bundler is a good example, taking 1.25s in a debug build and 43.13s for a release build, which is a gigantic difference. I tried downloading just that crate from crates.io and compiling it using the rustc-perf benchmark harness. I got reasonably similar results: 0.8s for debug and 33.0s for release.

I then tried profiling the compiler with samply while doing debug and release builds of swc_node_bundler. Here is the thread timeline for a debug build:

[Image: samply thread timeline, debug build]

and for an opt build:

[Image: samply thread timeline, release build]

rustc is the front-end thread, the other threads are doing codegen. There are multiple codegen threads running in parallel, which suggests that the codegen-units=1 theory from above is incorrect. (Besides, how does rspack even relate to swc?)

For the debug build we have four "opt" threads, which are WorkItem::Optimize units within the compiler. For the release build we have three "opt" threads and eleven "LTO" threads, which are WorkItem::LTO units within the compiler. The "LTO" threads don't start until the "opt" threads finish. The top-most "LTO" thread accounts for 26 seconds of the runtime, running by itself for much of that time, so that seems to be much of the problem.

I don't know much about these "LTO" threads, and why one of them would be so slow. It definitely seems odd. The crate has only 767 lines of Rust code in it, and it looks like very normal, reasonable code. My current theory is that one of the crates that swc_node_bundler depends on is doing something unusual that is causing lots of swc* crates to be so slow to codegen. swc_ecma_ast and swc_ecma_visit look like the ones all the slow-to-compile crates have in common. Interestingly, those are two of the three crates with problematic macros that @Boshen mentioned above.

Ok, there ends part 1 of my analysis.

@Boshen
Author

Boshen commented Mar 20, 2023

@nnethercote Thank you so much for looking into this.

> how does rspack even relate to swc

The rspack project depends on almost all of swc.

> I've never seen so much purple in a --timings graph.

I thought it was normal for codegen to take this much time; apparently it's not 😞 So now we have two problems at hand: macros and codegen.

@kdy1
Member

kdy1 commented Mar 20, 2023

Oh...
Interesting. I took the ratio in the graph for granted because swc is my first big Rust project, but it turns out it's not common 🤣

I know the solution for the proc macro part, and AFAIK the long codegen is caused by visitors.
I profiled it a long time ago, although I didn't use rustc perf tester.

@nnethercote

Now for the crates using macros.

swc_ecma_visit-0.86.1

  • Has 10,018 lines of code, but 7,355 of that is generated in target/debug/build/swc_atoms-fa283f5fd94de3ff/out/js_word.rs, mostly for a perfect hash function.
  • cargo expand's output is 84,893 lines of code. That's a lot! Much of that is lots of very large types with many fold and visit operations defined on them.

swc_ecma_ast-0.100.1

  • 14,621 lines of code, but again, 7,355 of that is in target/debug/build/swc_atoms-fa283f5fd94de3ff/out/js_word.rs
  • cargo expand's output is 120,008 lines of code. That's even more! Much of that is serializing/deserializing code generated by serde, which is known to produce verbose code.

I looked at samply profiles of check builds for both of these. The profiles looked pretty normal, which suggests that the code isn't particularly unusual, but just that there's a lot of it.

I think it will take project-specific understanding to improve things. Looking at the output of cargo expand could be helpful. There is so much code there. Is all of it necessary? Could it be made shorter? And maybe reducing the amount of code in those modules might help with the codegen times in later modules.

@nnethercote

nnethercote commented Mar 20, 2023

> Interesting. I took the ratio graph granted because swc is my first big rust project, but it was not common 🤣

That's right. If you look at the earlier swc crates, and the non-swc crates, you can see that the blue and purple lengths are usually fairly similar. (Likewise with all the crates in debug builds.) Sometimes the purple part might be 4 or 5 times longer, which isn't unusual. But ratios like 10, 20, 30 are unusual.

@kdy1
Member

kdy1 commented Mar 20, 2023

About macros:

  • swc_ecma_ast:

Macros create an enormous amount of code, and I think it can be reduced a bit, but not by a large margin. My main trick for reducing the amount of code will be extracting common code to swc_common or swc_visit.

It includes

  • custom derive for serde

  • many implementations of From<T>

  • derive of rkyv::Archive

  • derive of many built-in traits

  • swc_ecma_visit

The proc-macro generates two kinds of visitors. The first kind is the general visitors used by swc itself, and the second is the path-aware visitors used by rspack and turbopack.
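
To make the distinction concrete, here is a toy sketch of the two visitor shapes over a hypothetical two-variant AST. The names `Visit`, `VisitWithPath` and `Kind` are chosen for illustration and the real generated traits are far larger; the point is only that every node type gets a method in both traits, so the generated code roughly doubles.

```rust
/// Hypothetical two-variant AST, far smaller than swc's real one.
pub enum Expr {
    Num(f64),
    Add(Box<Expr>, Box<Expr>),
}

/// Plain visitor: callbacks receive only the node.
pub trait Visit {
    fn visit_num(&mut self, _value: f64) {}

    fn visit_expr(&mut self, expr: &Expr) {
        match expr {
            Expr::Num(v) => self.visit_num(*v),
            Expr::Add(l, r) => {
                self.visit_expr(l);
                self.visit_expr(r);
            }
        }
    }
}

/// Kinds of ancestor recorded on the path (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Kind {
    Add,
}

/// Path-aware visitor: callbacks also receive the chain of ancestors.
pub trait VisitWithPath {
    fn visit_num(&mut self, _value: f64, _path: &[Kind]) {}

    fn visit_expr(&mut self, expr: &Expr, path: &mut Vec<Kind>) {
        match expr {
            Expr::Num(v) => self.visit_num(*v, path.as_slice()),
            Expr::Add(l, r) => {
                // Record this node as an ancestor while visiting children.
                path.push(Kind::Add);
                self.visit_expr(l, path);
                self.visit_expr(r, path);
                path.pop();
            }
        }
    }
}
```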

Btw, can #[inline] attributes or generics generated by the proc-macro cause such issues?

@kdy1 kdy1 assigned kdy1 and unassigned Boshen Mar 23, 2023
@kdy1
Member

kdy1 commented Mar 23, 2023

I'll work on this

@mischnic
Contributor

mischnic commented Mar 24, 2023

> cargo expand's output is 120,008 lines of code. That's even more! Much of that is serializing/deserializing code generated by serde, which is known to produce verbose code.

Parcel (and probably also rspack) currently doesn't use serde/rkyv for swc ASTs, so maybe compile time could be improved here by putting that behind a cargo feature.

@kdy1
Member

kdy1 commented Mar 24, 2023 via email

@kdy1
Member

kdy1 commented Mar 27, 2023

Can you try swc_core@v0.70.0?
I made serde support for the AST optional and off by default, so compile time should improve.
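
For readers unfamiliar with the pattern, feature-gating a derive generally looks like the sketch below. The `Ident` struct and the feature name `serde-support` are hypothetical and may not match what swc uses; with the feature off, the serde derives are never expanded, so this compiles without serde being present at all.

```rust
/// Hypothetical AST node. The serde derives are only expanded when the
/// (assumed) `serde-support` cargo feature is enabled, so default builds
/// skip the generated (de)serialization code entirely.
#[cfg_attr(
    feature = "serde-support",
    derive(serde::Serialize, serde::Deserialize)
)]
#[derive(Debug, Clone, PartialEq)]
pub struct Ident {
    pub sym: String,
    pub optional: bool,
}
```

The matching manifest would declare serde as an optional dependency and list the feature under `[features]`, so only consumers that actually serialize ASTs pay for the extra generated code.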

@Boshen
Author

Boshen commented Mar 27, 2023

Rspack is currently stuck on an older version due to #7085, I'll report back the improvements once we upgrade to the latest version when that's fixed on our end.

@mischnic
Contributor

Not sure if I did something wrong, but the compile time didn't get better for Parcel on my machine:

I ran rm -rf target && RUSTC_WRAPPER= yarn workspace @parcel/transformer-js build-release in the root of the repo.

@kdy1
Member

kdy1 commented Mar 27, 2023

Oh... interesting.

I ran cargo build --timings and cargo build --timings --release from the repository root of vercel/turbo.
(Also, I ran cargo clean each time, and I did nothing while compiling to reduce noise)

Debug build:

New: Finished dev [unoptimized + debuginfo] target(s) in 2m 15s
Prev: Finished dev [unoptimized + debuginfo] target(s) in 2m 56s

This was the result for turbopack, but it was measured before the AST change.

@kdy1 kdy1 added this to the Planned milestone Mar 28, 2023
@Boshen
Author

Boshen commented Mar 29, 2023

I found cargo-llvm-lines from matklad's blog post.

This can be used to guide your refactoring, e.g.

cargo llvm-lines -p swc_ecma_parser | head -20

  Lines                 Copies              Function name
  -----                 ------              -------------
  368372                4504                (TOTAL)
   11760 (3.2%,  3.2%)    35 (0.8%,  0.8%)  alloc::raw_vec::RawVec<T,A>::grow_amortized
   11707 (3.2%,  6.4%)    13 (0.3%,  1.1%)  swc_ecma_parser::parser::class_and_fn::<impl swc_ecma_parser::parser::Parser<I>>::parse_fn_args_body::{{closure}}
    9803 (2.7%,  9.0%)     1 (0.0%,  1.1%)  swc_ecma_parser::parser::stmt::module_item::<impl swc_ecma_parser::parser::Parser<I>>::parse_export
    7383 (2.0%, 11.0%)    15 (0.3%,  1.4%)  swc_ecma_parser::parser::typescript::<impl swc_ecma_parser::parser::Parser<I>>::try_parse_ts
    5477 (1.5%, 12.5%)     1 (0.0%,  1.4%)  swc_ecma_parser::parser::stmt::module_item::<impl swc_ecma_parser::parser::Parser<I>>::parse_import

In some other place:

  89536 (11.0%, 11.0%)  1399 (5.3%,  5.3%)  swc_visit::AstNodePath<N>::with
  37854 (4.7%, 15.7%)    701 (2.6%,  7.9%)  swc_visit::AstKindPath<K>::with

In cases where generics cannot be removed, https://matklad.github.io/2021/09/04/fast-rust-builds.html#Keeping-Instantiations-In-Check explains the "inner" technique.
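
As an illustration, here is a sketch of a variation on that technique applied to a closure-taking helper shaped like the `with` functions in the output above. `NodePath` is a hypothetical type, not swc's AstNodePath, and erasing the closure type with `dyn` is my assumption about how one might adapt the idea here. The generic shell only forwards to a non-generic inner function, so the bulk of the body is compiled once instead of once per closure.

```rust
/// Hypothetical path tracker; not swc's actual implementation.
pub struct NodePath {
    kinds: Vec<&'static str>,
}

impl NodePath {
    pub fn new() -> Self {
        NodePath { kinds: Vec::new() }
    }

    pub fn depth(&self) -> usize {
        self.kinds.len()
    }

    /// Generic shell: instantiated once per closure type, so it only
    /// erases the closure's type and forwards.
    pub fn with<F: FnMut(&mut NodePath)>(&mut self, kind: &'static str, mut f: F) {
        self.with_inner(kind, &mut f)
    }

    /// Non-generic "inner" function: compiled once, no matter how many
    /// different closures are passed to `with`.
    fn with_inner(&mut self, kind: &'static str, f: &mut dyn FnMut(&mut NodePath)) {
        self.kinds.push(kind);
        f(&mut *self);
        self.kinds.pop();
    }
}
```

The trade-off is a dynamic call per invocation, so whether this is a net win would need measuring on the real code, for instance by rerunning cargo llvm-lines.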

@kdy1 kdy1 removed their assignment Mar 29, 2023
@elenakrittik

This comment was marked as spam.

@swc-project swc-project temporarily blocked elenakrittik Jun 2, 2023
@kdy1 kdy1 self-assigned this Aug 17, 2023
@kdy1
Member

kdy1 commented Aug 17, 2023

I'll tackle this again in the near future. I want to make the ES AST/parser extensible and have a related idea, but it will make compilation slower.

@kdy1
Member

kdy1 commented Sep 3, 2023

I think one of the biggest problems is the lack of parallelism, and wrote #7911

It's an RFC, and I want to hear opinions about such a CLI tool.

@kdy1
Member

kdy1 commented Oct 12, 2023

#8110 should improve compile time a bit.

Also, this is an experiment for turbopack, but depending directly on crates makes compilation faster. vercel/turbo#5879

I'm going to create a CLI tool that lets you manage dependencies just as if you were using swc_core, while depending directly on the crates.

@kdy1
Member

kdy1 commented Oct 16, 2023

@Boshen @mischnic Can you profile it again, but with swc_core and only with the features you use?

@kdy1
Member

kdy1 commented Oct 16, 2023

@elenakrittik

> i just want to bring more attention to this issue

I marked it as spam because of this part.

@mischnic
Contributor

I ran rm -rf target && RUSTC_WRAPPER= cargo build --timings --release -p parcel-js-swc-core in the Parcel monorepo root

Before: 4m 40s
After bumping swc (with that compat split): 5m 42s

Timings reports: cargo-timings.zip

@kdy1
Member

kdy1 commented Oct 16, 2023

[Images: two cargo timings reports]

Hmm... The result is closer to a noisy neighbor issue IMO, because non-swc crates show too much difference.

kdy1 added a commit that referenced this issue Feb 22, 2024
**Description:**


**Related issue:**

 - Closes #7071.
@kdy1 kdy1 modified the milestones: Planned, v1.4.3 Mar 5, 2024
@swc-bot
Collaborator

swc-bot commented Apr 4, 2024

This closed issue has been automatically locked because it had no new activity for a month. If you are running into a similar issue, please create a new issue with the steps to reproduce. Thank you.

@swc-project swc-project locked as resolved and limited conversation to collaborators Apr 4, 2024