Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Prepare scripts in workspaces should wait for dependencies #3034

Closed
timoxley opened this issue Apr 6, 2021 · 11 comments
Closed

[BUG] Prepare scripts in workspaces should wait for dependencies #3034

timoxley opened this issue Apr 6, 2021 · 11 comments
Labels
Priority 1 high priority issue Release 7.x work is associated with a specific npm 7 release

Comments

@timoxley
Copy link
Contributor

timoxley commented Apr 6, 2021

Current Behaviour:

It seems that prepare scripts are now run in parallel, which means that if:

  • A depends on B
  • B's prepare script takes some time. e.g. runs tsc
  • A uses B in its prepare script e.g. runs own tsc which requires B to be built first

Then A's prepare script will fail because B's has not completed.

If I switch the package.json scripts from prepare to prepublish then it seems to work without a problem.

Is there an alternative lifecycle script to the deprecated prepublish?

This may be an arborist issue.

Expected Behaviour:

npm install should wait for dependent's prepare scripts to complete before running.

Steps To Reproduce:

I've tried to capture a reproduction here:
https://github.com/timoxley/npm7-prepare-issue

git clone git@github.com:timoxley/npm7-prepare-issue.git
cd npm7-prepare-issue
npm install # this will break
npm install # works on second run
# run this to clean up and try again
npm run clean

Environment:

  • OS: MacOS 11.2.3
  • Node: v14.16.0
  • npm: 7.8.0
@timoxley timoxley added Bug thing that needs fixing Needs Triage needs review for next steps Release 7.x work is associated with a specific npm 7 release labels Apr 6, 2021
@wraithgar
Copy link
Member

All of the lifecycle scripts have a pre and post event that runs before and after the main scripts. Their reason for existing is cases like this. They admittedly only work for very simple cases like this where there are only two packages, and having one go before the other solves it. More complex setups would require a whole new spec.

So, to be clear, before prepare scripts are run any preprepare scripts are run, and will have finished by the time prepare is running. postprepare scripts will then run and all prepare scripts will have finished by the time postprepare scripts have started.

@wraithgar wraithgar removed Needs Triage needs review for next steps Bug thing that needs fixing labels Apr 8, 2021
@wraithgar
Copy link
Member

If you would like to propose a more formal solution for this problem that allows for more complex situations to be addressed, the RFC process is a good place to suggest something: https://github.com/npm/rfcs

@timoxley
Copy link
Contributor Author

timoxley commented Apr 8, 2021

Preprepare does not solve my actual problem because I have 6 packages in my workspace and a nesting depth of 3.

I can make a rfc but I feel like it's more of a bug than a new feature, npm should be running the scripts for child dependencies before the parents.

@timoxley
Copy link
Contributor Author

timoxley commented May 11, 2021

@wraithgar Would like to reiterate this really is a bug and currently I believe this makes it impossible to use typescript in a workspace since you can't ensure a package is actually built before something tries to use it.

@wraithgar
Copy link
Member

Sorry, didn't see that this was specifically related to workspaces when I replied.

npm install does not currently support workspaces. It is being worked on right now, and prepare script ordering will be something that is handled correctly when it does ship.

@wraithgar
Copy link
Member

You can follow the progress of adding workspace support to all of arborist's reify commands in the cli here

@timoxley
Copy link
Contributor Author

npm install does not currently support workspaces.

@wraithgar What does that mean? It's in the docs: https://docs.npmjs.com/cli/v7/using-npm/workspaces#installing-workspaces

@wraithgar
Copy link
Member

You're right, looks like install is already pulling in workspaces, which is different than the enable workspaces context work being added to install. Sorry for the confusion. We're on the bleeding edge here of a new thing currently being implemented, some of the terminology is overlapping.

This is already on our radar but I'll reopen this so it's being explicitly represented here in this issue too.

@wraithgar wraithgar reopened this May 12, 2021
@ruyadorno ruyadorno added the Priority 1 high priority issue label May 21, 2021
ruyadorno added a commit to npm/arborist that referenced this issue Jul 14, 2021
Adds the ability for lifecycle scripts of Link nodes to depend
on each other, such that a `prepare` script of linked module A
can make use of files that are a result of a `prepare` script of
a linked package B.

This is important in order to unlock workflows in which a workspace
needs to make use of another workspace but they all have transpilation
steps (very common in Typescript projects) so tracking the order
(and completion) in which the dependencies lifecycle scripts runs
becomes necessary.

Fixes: npm/cli#3034
ruyadorno added a commit to npm/arborist that referenced this issue Jul 14, 2021
Adds the ability for lifecycle scripts of Link nodes to depend
on each other, such that a `prepare` script of linked module A
can make use of files that are a result of a `prepare` script of
a linked package B.

This is important in order to unlock workflows in which a workspace
needs to make use of another workspace but they all have transpilation
steps (very common in Typescript projects) so tracking the order
(and completion) in which the dependencies lifecycle scripts runs
becomes necessary.

Fixes: npm/cli#3034
ruyadorno added a commit to npm/arborist that referenced this issue Jul 14, 2021
Adds the ability for lifecycle scripts of Link nodes to depend
on each other, such that a `prepare` script of linked module A
can make use of files that are a result of a `prepare` script of
a linked package B.

This is important in order to unlock workflows in which a workspace
needs to make use of another workspace but they all have transpilation
steps (very common in Typescript projects) so tracking the order
(and completion) in which the dependencies lifecycle scripts runs
becomes necessary.

Fixes: npm/cli#3034
@isaacs
Copy link
Contributor

isaacs commented Jul 22, 2021

In the short term, you can force serialized operation by doing npm install --foreground-scripts.

More generally, this does get a bit tricky in the case where the dependency graph cycles back on itself. For example, a -> b -> c -> a, where all three are workspaces with prepare scripts. So we can't just naively say "wait until all your deps prepares have completed", because it'll deadlock. At least, though, we can say "wait until all your deps prepares have completed, except those that are in the waiting queue already". So in the a->b->c->a case, it would wait on a until b runs, then wait on b until c runs, then c would run immediately, since a is already waiting for it.

@isaacs
Copy link
Contributor

isaacs commented Jul 22, 2021

Hm, exploring a bit further, there is really no way to do solve this bootstrapping problem in general that doesn't lead to either a traveling salesman problem or deadlocks.

Consider this dependency graph:

root -> (b)
a -> (b)
b -> (c, x)
c -> (d)
d -> (e, y)
e -> (a)
x -> (e, y)
y -> (c)

Let's say that:

  1. x, y, a, and e have prepare scripts
  2. all modules with prepare scripts require() their dependencies at least once within the prepare script and within the main module.
  3. all modules with prepare scripts cannot be loaded until their prepare scripts are run.

This is unresolvable. Picking any arbitrary point in the graph, we end up with a failure for at least 2 reasons:

  • a -> requires b
  • b -> depends on x's prepare script, and c being loadable
  • c -> depends on d being loadable
  • d -> depends on e's and y's prepare scripts running
  • x -> depends on e's prepare script running, and y's prepare script running
  • e -> depends on a's prepare script having run (!)
  • y -> depends on c being loadable (!!)

Furthermore, even if we say we just won't handle cycles in this way, sorting by any combination of dependency relationships or depth within the dependency graph as a heuristic doesn't really work either. Take for example the c node. There are multiple routes to it:

  • root -> b -> c
  • root -> b -> x -> y -> c

When we look at the y node, we find:

  • root -> b -> x -> y
  • root -> b -> c -> d -> y

So, if we have y and c in the queue to run a prepare script, it's impossible to reliably guess which one should go first. If we had all these paths, we could arbitrarily say we're going to sort them shortest to longest, and then use that, but that's literally the classic example of an NP-hard problem, and will spiral out of control very quickly when we have graphs with thousands of nodes.

Even if we run package scripts in serial like npm v6 did (and which you can do with the --foreground-scripts config option), or lay out the dependency graph maximally nested like npm v1 and v2 did (and which you can do with the --legacy-bundling config option), there's always going to be cases where this just doesn't work.

Simply put, there is no way we can guarantee, or ever could have guaranteed, that a package's dependencies will have their prepare scripts run prior to its own. This was working for you only accidentally, because you happened to have a non-cyclic dependency graph where sorting in alphabetical order happened to avoid the issue. You can still do this with --foreground-scripts, so if that was working before, then that is a workaround for now.

The safest assumption is that lifecycle events happen in a single platonic moment for all packages within the tree, and design your modules accordingly.

@isaacs isaacs closed this as completed Jul 22, 2021
@pro-wh
Copy link

pro-wh commented Jul 23, 2021

Thanks for the workaround with --foreground-scripts. I think that's suitable for the cases where I needed ordered prepares. And in the bigger picture, our workspaces will be prepared and uploaded separately, so our downstream users won't need to fiddle with their flags.

But I must say that the fact that cycles can merely be represented has not often been sufficient reason to give up on something like this. for loops in JavaScript exist despite for (;;) {} being expressible. Even require cycles aren't fatal, the quandary not resolved in the style of "every require initially returns an empty object."


This [a best effort ordering based on a traversal] was working for you only accidentally, because you happened to have a non-cyclic dependency graph where sorting in alphabetical order happened to avoid the issue.

I have had a different experience with workspaces, where it's rather not by accident that we have non-cyclic dependency graphs among workspaces.

  • all modules with prepare scripts require() their dependencies at least once within the prepare script and within the main module. ... [leading to analysis of algorithms known to be hard]

It might even be reasonable to suppose that:

  • the prepare scripts execute their dependencies
  • those dependencies at runtime require their dependencies

and thus anything downstream would be needed, as probably as direct dependencies. I'd be happy to have --foreground-scripts behavior, even if npm didn't especially maximize the direct dependencies' availability.


Using the ordering from --foreground-scripts would be great for projects that have acyclic workspace dependencies; without it, setting up a project such as the one described in the first post from a repository checkout would fail every time.

Ordered prepare for projects with cyclic dependencies seems better too. If there is some permutation that would work, I'd prefer to have npm install either succeed or fail predictably rather than occasionally stumble on one of these sequences of events. Or more sinister still, it could sometimes expose outdated packages during dependents' build process, resulting in build errors that I could have sworn I already fixed.

Doing them all at once is faster for projects with packages that can be prepared independently, but among projects that have set up workspaces, I really don't know how common that is. Doing

The behavior of npm running the prepare scripts from the top level workspace is relatively new, first added in 7.20.0. Right now would be the best opportunity to make ordering like --foreground-scripts the default.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Priority 1 high priority issue Release 7.x work is associated with a specific npm 7 release
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants