Question: Do you reimplement the typescript ast? #101

Closed
scottmas opened this issue May 9, 2020 · 8 comments

@scottmas

scottmas commented May 9, 2020

To get your blazing speeds, did you reverse engineer the typescript parser in golang? If so it seems crazy to me that this works! In either case, nice job on this library.

@evanw
Owner

evanw commented May 9, 2020

Sort of. I wrote my own code that parses TypeScript, but it discards the information about type annotations instead of creating AST nodes for them. The information isn't needed to convert TypeScript to JavaScript and it's more efficient to just treat type annotations the same as whitespace.

TypeScript-only syntax such as enum and namespace does end up in the final output but the parser converts it to JavaScript before it's done parsing, so the only thing that comes out of the parser is JavaScript.
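As a rough illustration of what "converting enum to JavaScript" means, here is a minimal sketch of the IIFE-style lowering that the TypeScript compiler itself emits for a simple enum. The enum name `Color` and its members are made up for this example, and esbuild's exact output may differ in detail.

```typescript
// TypeScript source (shown as a comment, since it is the input, not the output):
//
//   enum Color { Red, Green = 5, Blue }
//
// Approximately the plain JavaScript that remains after lowering. The `: any`
// annotations are only here so this file type-checks; they are not in the output.
var Color: any;
(function (Color: any) {
  // Each member gets a forward mapping (name -> value)
  // and a reverse mapping (value -> name).
  Color[Color["Red"] = 0] = "Red";
  Color[Color["Green"] = 5] = "Green";
  Color[Color["Blue"] = 6] = "Blue"; // auto-increments from the previous member
})(Color || (Color = {}));

console.log(Color.Red, Color.Green, Color.Blue, Color[6]);
```

Nothing enum-specific survives in the result: it is an ordinary object populated by an ordinary function call, which is why the parser can hand plain JavaScript to the rest of the pipeline.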

It was pretty crazy though. I unfortunately had to add backtracking to handle certain constructs. I'll also have to keep it updated as the TypeScript project adds new type system syntax, which luckily happens pretty slowly. Here were some of the most complicated parts:

// This is a spot where the TypeScript grammar is highly ambiguous. Here are
// some cases that are valid:
//
// let x = (y: any): (() => {}) => { };
// let x = (y: any): () => {} => { };
// let x = (y: any): (y) => {} => { };
// let x = (y: any): (y[]) => {};
// let x = (y: any): (a | b) => {};
//
// Here are some cases that aren't valid:
//
// let x = (y: any): (y) => {};
// let x = (y: any): (y) => {return 0};
// let x = (y: any): asserts y is (y) => {};
//

// This is a very complicated and highly ambiguous area of TypeScript
// syntax. Many similar-looking things are overloaded.
//
// TS:
//
// A type cast:
// <A>(x)
// <[]>(x)
// <A[]>(x)
//
// An arrow function with type parameters:
// <A>(x) => {}
// <A, B>(x) => {}
// <A = B>(x) => {}
// <A extends B>(x) => {}
//
// TSX:
//
// A JSX element:
// <A>(x) => {}</A>
// <A extends>(x) => {}</A>
// <A extends={false}>(x) => {}</A>
//
// An arrow function with type parameters:
// <A, B>(x) => {}
// <A extends B>(x) => {}
//
// A syntax error:
// <[]>(x)
// <A[]>(x)
// <A>(x) => {}
// <A = B>(x) => {}
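To make the backtracking idea concrete, here is a toy sketch of a speculative parse: try one interpretation, and if it fails, rewind to the saved position and try another. This is not esbuild's code (esbuild is written in Go); the `Parser` class, its token format, and all method names are hypothetical and only handle the trivial `(x)` versus `(x) =>` case.

```typescript
class Parser {
  tokens: string[];
  pos = 0;

  constructor(tokens: string[]) {
    this.tokens = tokens;
  }

  // Run `attempt`; if it throws, restore the saved position and return null.
  tryParse<T>(attempt: () => T): T | null {
    const saved = this.pos;
    try {
      return attempt();
    } catch {
      this.pos = saved; // backtrack
      return null;
    }
  }

  expect(tok: string): void {
    if (this.tokens[this.pos] !== tok) throw new Error(`expected ${tok}`);
    this.pos++;
  }

  parseIdent(): string {
    const tok = this.tokens[this.pos];
    if (tok === undefined || !/^[A-Za-z_]\w*$/.test(tok)) {
      throw new Error("expected identifier");
    }
    this.pos++;
    return tok;
  }

  // "(" ident ")" "=>" is an arrow function head;
  // "(" ident ")" on its own is a parenthesized expression.
  parseParenOrArrow(): string {
    const arrow = this.tryParse(() => {
      this.expect("(");
      const param = this.parseIdent();
      this.expect(")");
      this.expect("=>");
      return `arrow(${param})`;
    });
    if (arrow !== null) return arrow;

    // The arrow interpretation failed, so re-parse from the same spot.
    this.expect("(");
    const name = this.parseIdent();
    this.expect(")");
    return `paren(${name})`;
  }
}

console.log(new Parser(["(", "x", ")", "=>"]).parseParenOrArrow());
console.log(new Parser(["(", "x", ")"]).parseParenOrArrow());
```

The cost of this approach is that the same tokens may be scanned more than once, which is presumably why backtracking is described above as something that "unfortunately" had to be added.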

@scottmas
Author

Wow, that is crazy. So basically, you wrote an entire JS/TS parser from scratch, albeit a bit lossy since the resulting AST only records tokens that will actually end up in the generated js bundle.

Another question: is it a non-goal of esbuild to support golang plugins that can "hook" into the AST before the JS bundle is actually generated? Or to have any sort of plug and play architecture? Babel for example (according to my understanding) has just become a glorified AST specification with a million different plugins and a couple "official" plugins.

@evanw
Owner

evanw commented May 10, 2020

Yes, that's a non-goal. The AST format is currently an internal implementation detail. AST transformation passes are all merged together into as few passes as possible, which is great for speed but not for a plug-and-play architecture.

The goals of this project are roughly a) speed and correctness, b) build a real bundler, and c) inspire the community to build faster tools. It's hard enough building a working bundler from scratch with all of these features. It would be a distraction from these goals to also attempt to build a novel plug-and-play architecture with abstractions that enable a large community-run project at the same time, especially since doing that often comes at the expense of performance.

Perhaps once this project is more mature it can then be factored into something more like that, or another project that's more like that architecture can be started. Right now esbuild is still an experiment.

@scottmas
Author

That’s awesome. After playing around with Figma and seeing the black magic that enabled it to work in the browser, you were already a legend. This project just cements your legendary status in my mind. Thanks so much for taking the time to respond to these questions.

@kazzkiq

kazzkiq commented May 10, 2020

Just out of curiosity: does that mean that esbuild would not throw type errors? For example:

function greetings(name: string) {
  return `hello, ${name}`;
}

greetings(42); // would throw type error if compiled with tsc

@evanw
Owner

evanw commented May 11, 2020

Does that mean that esbuild would not throw type errors?

Yes, for a few reasons:

  • The type checker is where most of the work on the TypeScript compiler goes. It's much harder to port than just the transpiler because it changes all the time, whereas new syntax features are only added rarely (roughly at the same speed as JavaScript) so it's much easier to only handle TypeScript syntax.

  • All bundlers that care about speed won't do type checking because it's not relevant to the goal of producing a JavaScript bundle as fast as possible. Even if you wanted to check types, for maximal speed you'd want type checking to run completely in parallel since the two tasks are disjoint. So it's basically a completely independent project from a bundler.

@evanw
Owner

evanw commented May 11, 2020

I've thought a bit about how one might do this because their type checker is really slow, and it definitely breaks my flow sometimes. It's especially annoying to not be able to see type errors in files you don't currently have open in their IDE, which I assume is turned off because it's too slow.

A somewhat straightforward way to speed it up without the blessing of the TypeScript team and without signing up for manually porting every commit would be to come up with an automated porting process to a more efficient language. This would likely only be a moderate speedup because it's still the same single-threaded code, and you'd still have to handle a lot of JavaScript's quirks that slow it down. But you could probably cut some corners and special-case the conversion for the type of JavaScript that the TypeScript compiler team writes.

There may be some things about the compiler code that using a more efficient language could speed up, even if you don't rewrite their code. For example, I think the TypeScript compiler makes pretty heavy use of megamorphic object shapes for its AST nodes, which is somewhat of a performance anti-pattern for modern JavaScript JITs with inline caches. There are other languages that don't have that problem. A language with an optimizing compiler may also be able to do certain optimizations such as method devirtualization and function inlining that the JavaScript JIT can't do in certain cases. These performance improvements are usually hard to notice because it's death-by-a-thousand-cuts everywhere, but they can add up to a significant speedup (potentially 2x faster?). Here is some proof that wins like this exist: microsoft/TypeScript#39247.
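For readers unfamiliar with the object-shape point, here is a small sketch of the difference. It is not taken from the TypeScript compiler; the node types and helper names (`AnyNode`, `UniformNode`, `makeNode`, `sumKinds`) are hypothetical. The two arrays hold the same `kind` values, but in the first one every object has a different set of properties, so a property access like `node.kind` inside a hot loop is likely to see many shapes, which is the situation where engines with inline caches tend to fall back to slower generic lookups.

```typescript
// Nodes whose property sets differ from one object to the next.
type AnyNode = { kind: number; [extra: string]: unknown };

const variedNodes: AnyNode[] = [
  { kind: 1, value: 10 },
  { kind: 2, name: "x" },
  { kind: 3, left: null, right: null },
  { kind: 4, operand: null, op: "!" },
  { kind: 5, args: [], callee: null },
];

// Nodes that always carry the same properties in the same order,
// even when some of them are unused for a given kind.
interface UniformNode {
  kind: number;
  value: number;
  name: string;
  left: UniformNode | null;
  right: UniformNode | null;
}

function makeNode(kind: number): UniformNode {
  return { kind, value: 0, name: "", left: null, right: null };
}

const uniformNodes: UniformNode[] = [1, 2, 3, 4, 5].map((k) => makeNode(k));

// The same access site (`node.kind`) is used for both arrays. The results are
// identical; only the number of distinct object shapes it encounters differs.
function sumKinds(nodes: ReadonlyArray<{ kind: number }>): number {
  let total = 0;
  for (const node of nodes) total += node.kind;
  return total;
}

console.log(sumKinds(variedNodes), sumKinds(uniformNodes));
```

Whether the difference is measurable depends on the engine and the workload, so treat this as an illustration of the pattern rather than a benchmark.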

I say this with the caveat that I haven't extensively profiled the TypeScript compiler itself, but I do have extensive experience with writing a web programming language with an optimizing compiler that had to deal with similar issues.

But the real way to speed up the TypeScript compiler would be to get the TypeScript compiler team itself on board and motivated about getting major performance improvements in their compiler (major enough to consider either porting to a native language or writing their TypeScript differently to make automatic porting easier). That's going to be more sustainable and could also lead to bigger wins since you could potentially make algorithmic speedups, similar to what esbuild does for JavaScript bundling. But I assume that's never going to happen because the TypeScript compiler project is very mature at this point. I'm guessing their type checking algorithm is also not straightforward to parallelize so it may not be possible to go faster on multi-core machines.

So it would be an interesting project to tackle for sure! I might attempt it some day. But it's a separate project from esbuild.

@tooolbox
Contributor

I just have to note that your megamorphic object shapes link was enlightening, educational, and highly entertaining. (eagle to mother goose unknown shape spotted operation DEOPT is a go)
