New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sharing SourceFile
s between createWatchProgram
calls
#35676
Comments
cc: @sheetalkamat |
@sheetalkamat can you offer some advice? |
Not to be annoying, but pinging on this to make sure it didn't get lost (because I opened it over the holiday period). |
Thanks for the ping! @sheetalkamat and I are planning to take a look at this soon. |
Need more time to investigate whether the cause is because program outlives the need or because there are multiple programs at a time when one could work with one program at a time. The investigation and changes to api surely don't make 3.8.1 bar and need to be done in next milestone. |
Closing this in favor of typescript-eslint/typescript-eslint#1718 |
Search Terms
SourceFile, createWatchProgram, DocumentRegistry, WatchOfConfigFile
Background
A bit of background first, over in @typescript-eslint, we use the compiler API to parse files. The user passes us a set of
tsconfig
s, and we create a program for each of those projects. ESLint then asks us to parse a file, and we take the the pre-parsed AST, transform it to the ESTree AST spec, and return the new AST, as well as theProgram
. The former ESLint traverses to power all lint rules, the latter our type-aware lint rules use to do statically analysis.Our parser has to support two use cases:
Because of the way ESLint is designed[1], it has to support both of these use cases at the same time, without any information about which case the current run is.
Because of this, we built our parser on top of the
createWatchProgram
API, because it provides aBuilderProgram
instead of aProgram
, which is more amenable to the incremental updates we get in the IDE use case. We provide a customisedWatchCompilerHost
, which does not rely on filesystem watchers[2], and instead manually call the watch callbacks whenever ESLint tells us there's new file content. There's also a bit of code around detecting if a file is new, or if a file has been deleted.If you're interested, the code is here: https://github.com/typescript-eslint/typescript-eslint/blob/835378e505f462d965ce35cc4c81f8eee1704a30/packages/typescript-estree/src/create-program/createWatchProgram.ts#L227
[1] - ESLint only really has one API - the CLIEngine, which is used in the same way in both their CLI, and 3rd-party IDE extensions, such as
vscode-eslint
[2] - We cannot use file watchers due to the CLI use case, which has to exit automatically as soon as the lint run is finished. I tried non-persistent file watchers, but they are are pretty slow, filled with edge cases and are all around a real pain. Additionally file watchers are async, which doesn't work in ESLint's synchronous world.
Problem
So with all that being said, what's the problem?
For multi-package monorepos, a user will pass us a set of
tsconfig
s - one per package. We then go ahead and callcreateWatchProgram
once per tsconfig. Simple enough, right?The problem is we're hitting performance issues and OOMs (typescript-eslint/typescript-eslint#1192) for large projects, or just smaller, highly-interdependent projects with many packages.
As far as I can tell[3], the problem is that each program is independent, and isolated. If package A and package B both depend on package C, then we will have 3 copies of package C in 3 separate programs. For monorepos that are interdependent, this can cause the memory footprint be 2 or more times higher than it should be. For monorepos that share npm dependencies, this can cause a constant increase in the memory footprint.
Of course this double-parsing is also a performance problem - ESLint runs synchronously, so we cannot parallelise, meaning every wasted parse is wasted time.
[3] This is an educated guess based on the fact that parsing
n
tsconfigs with separatecreateWatchProgram
calls causes an OOM, whereas parsing the same code under a single tsconfig with a singlecreateWatchProgram
call does not OOM.I.e.
This OOMs:
n
tsconfigs in./packages/**/tsconfig.json
which look like{ include: ["src"] }
.This does not OOM: 1 tsconfig in
.
which looks like:{ include: ["packages/**/src"] }
Solutions
I'm not entirely certain the best approach here. I've done a bit of digging around the available APIs, but it doesn't look like there's another good solution for us right now. Which is one of the reasons I'm opening this issue - I was wondering if there's a better API we can use?
If I'm right and none of the other compiler APIs will work for us, then I believe we need to do one of:
Drop lower level, and re-implement
createWatchProgram
in a way that lets us share memory (by itself, this isn't a terrible idea - we could remove a bunch of hacks if we did this, but I hate having to duplicate all of that work, considering thecreateWatchProgram
stack is some 1600 lines).Update
createWatchProgram
, so it supports sharing memory.Create a new API, similar to
createWatchProgram
, but which better fits our use case and supports sharing memory.In terms of sharing memory, I noticed that the Language Service API supports passing a shared
DocumentRegistry
, so I am curious if this problem is "as simple as" going with option (2), and updatingcreateWatchProgram
to accept aDocumentRegistry
?I'm not sure - I'd appreciate any advice from you guys, as you are the experts. I'm happy to dive in an attempt a PR here if that's what's required.
The text was updated successfully, but these errors were encountered: