Question: Do you reimplement the typescript ast? #101

scottmas · 2020-05-09T21:35:23Z

To get your blazing speeds, did you reverse engineer the typescript parser in golang? If so it seems crazy to me that this works! In either case, nice job on this library.

evanw · 2020-05-09T21:46:08Z

Sort of. I wrote my own code that parses TypeScript, but it discards the information about type annotations instead of creating AST nodes for them. The information isn't needed to convert TypeScript to JavaScript and it's more efficient to just treat type annotations the same as whitespace.
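
To illustrate the idea of treating type annotations like whitespace, here is a minimal sketch of a parser that reads a parameter list and keeps only the parameter names. This is not esbuild's code (esbuild is written in Go); the names `tokenize`, `parseParamNames`, and `skipType` are hypothetical, and the toy `skipType` only balances brackets, so it does not handle arrow function types such as `() => void`.

```typescript
// Toy lexer: identifiers and single-character punctuation only.
function tokenize(src: string): string[] {
  return src.match(/[A-Za-z_$][\w$]*|\S/g) ?? [];
}

// Parses "(a: T, b: U)" and returns only the parameter names.
// Type annotations are consumed but never turned into AST nodes.
function parseParamNames(tokens: string[]): string[] {
  let pos = 0;
  const names: string[] = [];

  const expect = (token: string): void => {
    if (tokens[pos] !== token) {
      throw new SyntaxError(`expected "${token}" but found "${tokens[pos]}"`);
    }
    pos++;
  };

  // Skip a type annotation: advance until a "," or ")" at bracket depth 0.
  const skipType = (): void => {
    let depth = 0;
    while (pos < tokens.length) {
      const t = tokens[pos];
      if (depth === 0 && (t === "," || t === ")")) return;
      if (t === "(" || t === "[" || t === "{" || t === "<") depth++;
      else if (t === ")" || t === "]" || t === "}" || t === ">") depth--;
      pos++;
    }
    throw new SyntaxError("unterminated type annotation");
  };

  expect("(");
  while (tokens[pos] !== ")") {
    if (pos >= tokens.length) throw new SyntaxError("missing closing parenthesis");
    names.push(tokens[pos++]); // the parameter name is the only thing kept
    if (tokens[pos] === ":") {
      pos++;
      skipType(); // the type is discarded, much like whitespace
    }
    if (tokens[pos] === ",") pos++;
  }
  expect(")");
  return names;
}
```

For example, `parseParamNames(tokenize("(a: number, b: Map<string, number[]>)"))` yields the names `a` and `b` and nothing else.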

TypeScript-only syntax such as enum and namespace does end up in the final output but the parser converts it to JavaScript before it's done parsing, so the only thing that comes out of the parser is JavaScript.
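
As a rough illustration of what "converting to JavaScript" means for an enum, the sketch below generates the JavaScript that a numeric enum is commonly lowered to. The shape follows what `tsc` emits for a simple numeric enum; esbuild's exact output may differ, and the helper name `lowerEnum` is hypothetical.

```typescript
// Hypothetical helper: emit JavaScript for a numeric enum whose members
// are numbered 0, 1, 2, ... in declaration order.
function lowerEnum(name: string, members: string[]): string {
  const body = members.map((member, index) => {
    const key = JSON.stringify(member);
    // Forward mapping (name -> number) and reverse mapping (number -> name).
    return `  ${name}[${name}[${key}] = ${index}] = ${key};`;
  });
  return [
    `var ${name};`,
    `(function (${name}) {`,
    ...body,
    `})(${name} || (${name} = {}));`,
  ].join("\n");
}

// lowerEnum("Color", ["Red", "Green"]) returns:
//
//   var Color;
//   (function (Color) {
//     Color[Color["Red"] = 0] = "Red";
//     Color[Color["Green"] = 1] = "Green";
//   })(Color || (Color = {}));
```

Nothing enum-specific survives in the result, which is why the rest of the pipeline only ever needs to understand JavaScript.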

It was pretty crazy though. I unfortunately had to add backtracking to handle certain constructs. I'll also have to keep it updated as the TypeScript project adds new type system syntax, which luckily happens pretty slowly. Here were some of the most complicated parts:

esbuild/internal/parser/parser.go

Lines 713 to 727 in 8d93493

    // This is a spot where the TypeScript grammar is highly ambiguous. Here are
    // some cases that are valid:
    //
    //     let x = (y: any): (() => {}) => { };
    //     let x = (y: any): () => {} => { };
    //     let x = (y: any): (y) => {} => { };
    //     let x = (y: any): (y[]) => {};
    //     let x = (y: any): (a | b) => {};
    //
    // Here are some cases that aren't valid:
    //
    //     let x = (y: any): (y) => {};
    //     let x = (y: any): (y) => {return 0};
    //     let x = (y: any): asserts y is (y) => {};
    //

esbuild/internal/parser/parser.go

Lines 2347 to 2378 in 8d93493

    // This is a very complicated and highly ambiguous area of TypeScript
    // syntax. Many similar-looking things are overloaded.
    //
    // TS:
    //
    //   A type cast:
    //     <A>(x)
    //     <[]>(x)
    //     <A[]>(x)
    //
    //   An arrow function with type parameters:
    //     <A>(x) => {}
    //     <A, B>(x) => {}
    //     <A = B>(x) => {}
    //     <A extends B>(x) => {}
    //
    // TSX:
    //
    //   A JSX element:
    //     <A>(x) => {}</A>
    //     <A extends>(x) => {}</A>
    //     <A extends={false}>(x) => {}</A>
    //
    //   An arrow function with type parameters:
    //     <A, B>(x) => {}
    //     <A extends B>(x) => {}
    //
    //   A syntax error:
    //     <[]>(x)
    //     <A[]>(x)
    //     <A>(x) => {}
    //     <A = B>(x) => {}
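
The backtracking mentioned above can be sketched roughly as follows for the plain TS (non-TSX) case: speculatively parse `<...>(...)` as the start of an arrow function, and if no `=>` follows, rewind and treat the input as a type cast. This is a toy under simplifying assumptions, not esbuild's implementation; `lex`, `classify`, `skipBalanced`, and `speculate` are hypothetical names, and the bracket matching is far cruder than a real type parser.

```typescript
type Kind = "arrow function" | "type cast";

// Toy lexer: "=>" is one token, then identifiers, then single characters.
function lex(src: string): string[] {
  return src.match(/=>|[A-Za-z_$][\w$]*|\S/g) ?? [];
}

function classify(tokens: string[]): Kind {
  let pos = 0;

  const expect = (token: string): void => {
    if (tokens[pos] !== token) throw new SyntaxError(`expected "${token}"`);
    pos++;
  };

  // Consume `open ... close`, allowing nested pairs of the same bracket.
  const skipBalanced = (open: string, close: string): void => {
    expect(open);
    for (let depth = 1; depth > 0; pos++) {
      const t = tokens[pos];
      if (t === undefined) throw new SyntaxError(`missing "${close}"`);
      if (t === open) depth++;
      else if (t === close) depth--;
    }
  };

  // Run `attempt`; on failure, restore the saved position (the backtrack).
  const speculate = (attempt: () => void): boolean => {
    const saved = pos;
    try {
      attempt();
      return true;
    } catch {
      pos = saved;
      return false;
    }
  };

  const isArrow = speculate(() => {
    skipBalanced("<", ">"); // type parameters, or the type of a cast
    skipBalanced("(", ")"); // parameters, or the parenthesized operand
    expect("=>");           // only an arrow function continues with "=>"
  });

  // A real parser would now re-parse from the restored position as a cast.
  return isArrow ? "arrow function" : "type cast";
}
```

With this sketch, `classify(lex("<A>(x) => {}"))` reports an arrow function, while `classify(lex("<A>(x)"))` falls back to a type cast after rewinding.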

scottmas · 2020-05-10T06:24:06Z

Wow, that is crazy. So basically, you wrote an entire JS/TS parser from scratch, albeit a bit lossy since the resulting AST only records tokens that will actually end up in the generated js bundle.

Another question: is it a non-goal of esbuild to support golang plugins that can "hook" into the AST before the JS bundle is actually generated? Or to have any sort of plug and play architecture? Babel for example (according to my understanding) has just become a glorified AST specification with a million different plugins and a couple "official" plugins.

evanw · 2020-05-10T17:53:29Z

Yes, that's a non-goal. The AST format is currently an internal implementation detail. AST transformation passes are all merged together into as few passes as possible, which is great for speed but not for a plug-and-play architecture.
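
To make the trade-off concrete, here is a small sketch of two transformations written first as separate passes and then merged into one traversal. The AST, the two transformations (identifier renaming and constant folding), and all names are hypothetical and chosen only for illustration; they are not taken from esbuild.

```typescript
type Expr =
  | { kind: "num"; value: number }
  | { kind: "id"; name: string }
  | { kind: "add"; left: Expr; right: Expr };

type Counter = { visits: number };

// Pass 1: rename identifiers according to `names`.
function rename(e: Expr, names: Map<string, string>, c: Counter): Expr {
  c.visits++;
  switch (e.kind) {
    case "num": return e;
    case "id": return { kind: "id", name: names.get(e.name) ?? e.name };
    case "add":
      return { kind: "add", left: rename(e.left, names, c), right: rename(e.right, names, c) };
  }
}

// Pass 2: fold additions whose operands are both numeric constants.
function fold(e: Expr, c: Counter): Expr {
  c.visits++;
  if (e.kind !== "add") return e;
  const left = fold(e.left, c);
  const right = fold(e.right, c);
  return left.kind === "num" && right.kind === "num"
    ? { kind: "num", value: left.value + right.value }
    : { kind: "add", left, right };
}

// Merged pass: the same two transformations in a single traversal.
function renameAndFold(e: Expr, names: Map<string, string>, c: Counter): Expr {
  c.visits++;
  switch (e.kind) {
    case "num": return e;
    case "id": return { kind: "id", name: names.get(e.name) ?? e.name };
    case "add": {
      const left = renameAndFold(e.left, names, c);
      const right = renameAndFold(e.right, names, c);
      return left.kind === "num" && right.kind === "num"
        ? { kind: "num", value: left.value + right.value }
        : { kind: "add", left, right };
    }
  }
}
```

On the expression `(1 + 2) + x`, the two separate passes visit 10 nodes in total and the merged pass visits 5, with the same result. The cost is that the merged function knows about both transformations, which is exactly what makes it awkward to expose as a plug-in point.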

The goals of this project are roughly a) speed and correctness, b) build a real bundler, and c) inspire the community to build faster tools. It's hard enough building a working bundler from scratch with all of these features. It would be a distraction from these goals to also attempt to build a novel plug-and-play architecture with abstractions that enable a large community-run project at the same time, especially since doing that often comes at the expense of performance.

Perhaps once this project is more mature it can then be factored into something more like that, or another project that's more like that architecture can be started. Right now esbuild is still an experiment.

scottmas · 2020-05-10T19:45:10Z

That’s awesome. After playing around with Figma and seeing the black magic that enabled it to work in the browser, you were already a legend. This project just cements your legendary status in my mind. Thanks so much for taking the time to respond to these questions.

kazzkiq · 2020-05-10T22:37:49Z

Just out of curiosity, does that mean that esbuild would not throw type errors? e.g.

function greetings(name: string) {
  return `hello, ${name}`;
}

greetings(42); // would throw type error if compiled with tsc

evanw · 2020-05-11T01:00:12Z

Does that mean that esbuild would not throw type errors?

Yes, for a few reasons:

The type checker is where most of the work on the TypeScript compiler goes. It's much harder to port than just the transpiler because it changes all the time, whereas new syntax features are only added rarely (roughly at the same speed as JavaScript) so it's much easier to only handle TypeScript syntax.

All bundlers that care about speed won't do type checking because it's not relevant to the goal of producing a JavaScript bundle as fast as possible. Even if you wanted to check types, for maximal speed you'd want type checking to run completely in parallel since the two tasks are disjoint. So it's basically a completely independent project from a bundler.

evanw · 2020-05-11T01:25:00Z

I've thought a bit about how one might speed up type checking because TypeScript's type checker is really slow, and it definitely breaks my flow sometimes. It's especially annoying to not be able to see type errors in files you don't currently have open in their IDE, which I assume is turned off because it's too slow.

A somewhat straightforward way to speed it up without the blessing of the TypeScript team and without signing up for manually porting every commit would be to come up with an automated porting process to a more efficient language. This would likely only be a moderate speedup because it's still the same single-threaded code, and you'd still have to handle a lot of JavaScript's quirks that slow it down. But you could probably cut some corners and special-case the conversion for the type of JavaScript that the TypeScript compiler team writes.

There may be some things about the compiler code that using a more efficient language could speed up, even if you don't rewrite their code. For example, I think the TypeScript compiler makes pretty heavy use of megamorphic object shapes for its AST nodes, which is somewhat of a performance anti-pattern for modern JavaScript JITs with inline caches. There are other languages that don't have that problem. A language with an optimizing compiler may also be able to do certain optimizations such as method devirtualization and function inlining that the JavaScript JIT can't do in certain cases. These performance improvements are usually hard to notice because it's death-by-a-thousand-cuts everywhere, but they can be a significant performance improvement (potentially 2x faster?). Here is some proof that wins like this exist: microsoft/TypeScript#39247.
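
For readers unfamiliar with the term, the sketch below shows what "object shapes" means in practice. It assumes the usual behavior of JavaScript engines with inline caches, where a property access that only ever sees one shape stays on a fast path and one that sees many different shapes becomes megamorphic. The node kinds and helper names are hypothetical and are not taken from the TypeScript compiler, and the sketch makes no timing claims.

```typescript
// Uniform shape: every node has the same properties in the same order.
interface UniformNode {
  kind: string;
  pos: number;
  end: number;
  text: string | undefined;
}

function makeUniform(kind: string, pos: number, end: number, text?: string): UniformNode {
  return { kind, pos, end, text };
}

const uniform: UniformNode[] = [
  makeUniform("Identifier", 0, 1, "x"),
  makeUniform("NumericLiteral", 4, 5),
  makeUniform("BinaryExpression", 0, 5),
  makeUniform("StringLiteral", 6, 9, "a"),
  makeUniform("ReturnStatement", 6, 12),
];

// Varied shapes: different property sets and different property orders,
// so each object below is likely to get its own hidden shape in the engine.
const varied = [
  { kind: "Identifier", text: "x", pos: 0, end: 1 },
  { pos: 4, end: 5, kind: "NumericLiteral", value: 1 },
  { end: 5, pos: 0, kind: "BinaryExpression", operator: "+" },
  { kind: "StringLiteral", pos: 6, end: 9, text: "a", quote: "'" },
  { kind: "ReturnStatement", end: 12, pos: 6 },
];

// The `n.pos` and `n.end` accesses see one shape when called with `uniform`
// and five shapes when called with `varied`.
function totalWidth(nodes: { pos: number; end: number }[]): number {
  let sum = 0;
  for (const n of nodes) sum += n.end - n.pos;
  return sum;
}
```

Both arrays produce the same answer from `totalWidth`; the difference is only in how much work the engine may have to do at each property access, which is the kind of cost a statically typed native language avoids by fixing the layout at compile time.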

I say this with the caveat that I haven't extensively profiled the TypeScript compiler itself, but I do have extensive experience with writing a web programming language with an optimizing compiler that had to deal with similar issues.

But the real way to speed up the TypeScript compiler would be to get the TypeScript compiler team itself on board and motivated about getting major performance improvements in their compiler (major enough to consider either porting to a native language or writing their TypeScript differently to make automatic porting easier). That's going to be more sustainable and could also lead to bigger wins since you could potentially make algorithmic speedups, similar to what esbuild does for JavaScript bundling. But I assume that's never going to happen because the TypeScript compiler project is very mature at this point. I'm guessing their type checking algorithm is also not straightforward to parallelize so it may not be possible to go faster on multi-core machines.

So it would be an interesting project to tackle for sure! I might attempt it some day. But it's a separate project from esbuild.

tooolbox · 2020-07-21T05:08:22Z

I just have to note that your megamorphic object shapes link was enlightening, educational, and highly entertaining. (eagle to mother goose unknown shape spotted operation DEOPT is a go)

scottmas closed this as completed May 10, 2020

evanw mentioned this issue Jul 20, 2020

Are you considering writing tsserver in golang? #280

Closed

evanw mentioned this issue Jun 5, 2021

What is the order of execution of esbuild's plugins and loaders, and is there any relevant documentation address? #1347

Closed
