Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sharing SourceFiles between createWatchProgram calls #35676

Closed
bradzacher opened this issue Dec 14, 2019 · 6 comments
Closed

Sharing SourceFiles between createWatchProgram calls #35676

bradzacher opened this issue Dec 14, 2019 · 6 comments
Assignees
Labels
Needs Investigation This issue needs a team member to investigate its status.

Comments

@bradzacher
Copy link
Contributor

bradzacher commented Dec 14, 2019

Search Terms

SourceFile, createWatchProgram, DocumentRegistry, WatchOfConfigFile

Background

A bit of background first, over in @typescript-eslint, we use the compiler API to parse files. The user passes us a set of tsconfigs, and we create a program for each of those projects. ESLint then asks us to parse a file, and we take the the pre-parsed AST, transform it to the ESTree AST spec, and return the new AST, as well as the Program. The former ESLint traverses to power all lint rules, the latter our type-aware lint rules use to do statically analysis.

Our parser has to support two use cases:

  • the "one-and-done" CLI run use case, where every file is parsed and linted exactly once, and then the process exits.
  • the "never ending story" IDE use case, where each file can be parsed 1..n times, each time with potentially different contents.

Because of the way ESLint is designed[1], it has to support both of these use cases at the same time, without any information about which case the current run is.

Because of this, we built our parser on top of the createWatchProgram API, because it provides a BuilderProgram instead of a Program, which is more amenable to the incremental updates we get in the IDE use case. We provide a customised WatchCompilerHost, which does not rely on filesystem watchers[2], and instead manually call the watch callbacks whenever ESLint tells us there's new file content. There's also a bit of code around detecting if a file is new, or if a file has been deleted.

If you're interested, the code is here: https://github.com/typescript-eslint/typescript-eslint/blob/835378e505f462d965ce35cc4c81f8eee1704a30/packages/typescript-estree/src/create-program/createWatchProgram.ts#L227

[1] - ESLint only really has one API - the CLIEngine, which is used in the same way in both their CLI, and 3rd-party IDE extensions, such as vscode-eslint
[2] - We cannot use file watchers due to the CLI use case, which has to exit automatically as soon as the lint run is finished. I tried non-persistent file watchers, but they are are pretty slow, filled with edge cases and are all around a real pain. Additionally file watchers are async, which doesn't work in ESLint's synchronous world.


Problem

So with all that being said, what's the problem?

For multi-package monorepos, a user will pass us a set of tsconfigs - one per package. We then go ahead and call createWatchProgram once per tsconfig. Simple enough, right?

The problem is we're hitting performance issues and OOMs (typescript-eslint/typescript-eslint#1192) for large projects, or just smaller, highly-interdependent projects with many packages.

As far as I can tell[3], the problem is that each program is independent, and isolated. If package A and package B both depend on package C, then we will have 3 copies of package C in 3 separate programs. For monorepos that are interdependent, this can cause the memory footprint be 2 or more times higher than it should be. For monorepos that share npm dependencies, this can cause a constant increase in the memory footprint.

Of course this double-parsing is also a performance problem - ESLint runs synchronously, so we cannot parallelise, meaning every wasted parse is wasted time.

[3] This is an educated guess based on the fact that parsing n tsconfigs with separate createWatchProgram calls causes an OOM, whereas parsing the same code under a single tsconfig with a single createWatchProgram call does not OOM.
I.e.
This OOMs: n tsconfigs in ./packages/**/tsconfig.json which look like { include: ["src"] }.
This does not OOM: 1 tsconfig in . which looks like: { include: ["packages/**/src"] }


Solutions

I'm not entirely certain the best approach here. I've done a bit of digging around the available APIs, but it doesn't look like there's another good solution for us right now. Which is one of the reasons I'm opening this issue - I was wondering if there's a better API we can use?

If I'm right and none of the other compiler APIs will work for us, then I believe we need to do one of:

  1. Drop lower level, and re-implement createWatchProgram in a way that lets us share memory (by itself, this isn't a terrible idea - we could remove a bunch of hacks if we did this, but I hate having to duplicate all of that work, considering the createWatchProgram stack is some 1600 lines).

  2. Update createWatchProgram, so it supports sharing memory.

  3. Create a new API, similar to createWatchProgram, but which better fits our use case and supports sharing memory.

In terms of sharing memory, I noticed that the Language Service API supports passing a shared DocumentRegistry, so I am curious if this problem is "as simple as" going with option (2), and updating createWatchProgram to accept a DocumentRegistry?

I'm not sure - I'd appreciate any advice from you guys, as you are the experts. I'm happy to dive in an attempt a PR here if that's what's required.

@uniqueiniquity
Copy link
Contributor

cc: @sheetalkamat

@RyanCavanaugh RyanCavanaugh added the Needs Investigation This issue needs a team member to investigate its status. label Dec 19, 2019
@RyanCavanaugh
Copy link
Member

@sheetalkamat can you offer some advice?

@bradzacher
Copy link
Contributor Author

Not to be annoying, but pinging on this to make sure it didn't get lost (because I opened it over the holiday period).

@uniqueiniquity
Copy link
Contributor

Thanks for the ping! @sheetalkamat and I are planning to take a look at this soon.

@RyanCavanaugh RyanCavanaugh added this to the TypeScript 3.8.1 milestone Jan 14, 2020
@sheetalkamat
Copy link
Member

Need more time to investigate whether the cause is because program outlives the need or because there are multiple programs at a time when one could work with one program at a time. The investigation and changes to api surely don't make 3.8.1 bar and need to be done in next milestone.

@sheetalkamat
Copy link
Member

Closing this in favor of typescript-eslint/typescript-eslint#1718

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Needs Investigation This issue needs a team member to investigate its status.
Projects
None yet
Development

No branches or pull requests

4 participants