Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support parallel tsc processes that build the same dependency project #42216

Closed
5 tasks done
Knagis opened this issue Jan 5, 2021 · 7 comments
Closed
5 tasks done

Support parallel tsc processes that build the same dependency project #42216

Knagis opened this issue Jan 5, 2021 · 7 comments
Labels
Duplicate An existing issue was already created

Comments

@Knagis
Copy link
Contributor

Knagis commented Jan 5, 2021

Suggestion

πŸ” Search Terms

parallel build, getBuildInfo Unexpected end of JSON input

βœ… Viability Checklist

My suggestion meets these guidelines:

  • This wouldn't be a breaking change in existing TypeScript/JavaScript code
  • This wouldn't change the runtime behavior of existing JavaScript code
  • This could be implemented without emitting different JS based on the types of the expressions
  • This isn't a runtime feature (e.g. library functionality, non-ECMAScript syntax with JavaScript output, new syntax sugar for JS, etc.)
  • This feature would agree with the rest of TypeScript's Design Goals.

⭐ Suggestion

Please support running multiple parallel tsc processes, each building a different composite project, however sharing common dependencies.

πŸ“ƒ Motivating Example

We have a rather large monorepo. Some of the packages in the repo reference others via typescript composite projects. When we validate typescript files in all packages, we run tsc (and various other package level validations like linters, jest etc.) in parallel to speed things up. However, when handling composite projects, we occasionally get failed builds because one tsc tries to read .buildinfo files that is being written at the exact same time by another process.

The exact error we usually see is

SyntaxError: Unexpected end of JSON input
    at JSON.parse (<anonymous>)
    at Object.getBuildInfo (/workdir/node_modules/typescript/lib/tsc.js:81380:21)
    at getUpToDateStatusWorker (/workdir/node_modules/typescript/lib/tsc.js:92604:45)
    at getUpToDateStatus (/workdir/node_modules/typescript/lib/tsc.js:92638:22)
    at getNextInvalidatedProject (/workdir/node_modules/typescript/lib/tsc.js:92369:26)
    at build (/workdir/node_modules/typescript/lib/tsc.js:92741:38)
    at Object.build (/workdir/node_modules/typescript/lib/tsc.js:92904:67)
    at performBuild (/workdir/node_modules/typescript/lib/tsc.js:93325:73)
    at Object.executeCommandLine (/workdir/node_modules/typescript/lib/tsc.js:93270:24)
    at Object.<anonymous> (/workdir/node_modules/typescript/lib/tsc.js:93551:4)

The big issue is that build fails, though of course if would be even better if the parallel builds also made sure they don't start building a dependency if another build is already working on that. Atomic writes might solve might solve the build failing and some sort of cross process locking could avoid parallel rebuilds.

πŸ’» Use Cases

Running parallel tsc processes on build agents so that shared dependencies are built only once as needed.

@MartinJohns
Copy link
Contributor

Related: #30235.

@RyanCavanaugh
Copy link
Member

This seems external / out of scope to me. Many, many tools will fail if you run them concurrently targeting the same operation, and tsc intentionally doesn't try rely on any host IPC primitives like would be needed here. Your parallelization launcher should ingest the project graph, invoking tsc without --build, and only parallelize at locations where this conflict isn't going to occur. The only "fix" we'd do here would be effectively to implement #30235.

@RyanCavanaugh RyanCavanaugh added the Duplicate An existing issue was already created label Jan 5, 2021
@Knagis
Copy link
Contributor Author

Knagis commented Jan 5, 2021

@RyanCavanaugh What about changing emits to be atomic writes (likely via renaming)? If buildinfo is written after the outputs are written, atomic writes would ensure that parallel tsc processes would not see invalid state on the disk - both would happily build in parallel and write the files twice - but the primary goal would be achieved as there wouldn't be an error of tsc trying to read file that is being written at the same moment.

@Knagis
Copy link
Contributor Author

Knagis commented Jan 5, 2021

Though atomic emit was deemed out of scope in #38690 - perhaps TSC itself throwing error instead of an external watcher would be a reason to reconsider?

@RyanCavanaugh
Copy link
Member

Atomic writes introduce other problems (file watchers from other processes see the temp file, for example, and you'd have to design the atomic write operation to also be resistant to two processes doing this at the same time which is another set of tradeoffs). You can host tsc from the API and pass it a custom host that implements writeFile as an atomic write if this unblocks your scenario.

@Knagis
Copy link
Contributor Author

Knagis commented Jan 5, 2021

yes, I will have to write a wrapper (or monkey patch tsc) with atomic write at least for buildinfo.. Eh, if only node exposed api to open files with locks..

@Knagis
Copy link
Contributor Author

Knagis commented Jun 7, 2023

My patch to enable this locking: #53763 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Duplicate An existing issue was already created
Projects
None yet
Development

No branches or pull requests

3 participants