Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(node): Cooperative Scheduling between startBuild and subProcess generators #7110

Closed
wants to merge 13 commits into from

Conversation

dpchamps
Copy link

Current Behavior

This pr patches the bug outlined here: #7031

The node:execute builder immediately exits after build when --watch=false.

Expected Behavior

The node:execute builder wait for the subprocess to exit, or SIGTERM to be received, before exiting. This is how things worked previously.

Related Issue(s)

Fixes #7031

Notes

The solution here is to essentially "cooperatively schedule" (using the term loosely) events emitted from the startBuild generator, and the events emitted from the child process.

We must allow build events to propagate (for example, when watch mode is configured), while also ensuring that the executor waits for the current process to run to completion.

I've implemented some things as classes here, because my sense was this was more in the style of the repo. Happy to capture them differently, if the team would like. Have dropped some github comments along the way, happy to discuss the impl furthur!

@vercel
Copy link

vercel bot commented Sep 23, 2021

This pull request is being automatically deployed with Vercel (learn more).
To see the status of your deployment, click below or on the icon next to each commit.

🔍 Inspect: https://vercel.com/nrwl/nx-dev/HPDNg9xaTVrx9CvSXgYsrYe5C7ei
✅ Preview: Canceled

[Deployment for 26a1a0e canceled]

return {
on: (eventName, cb) => {
mockSubProcess = {
on: jest.fn().mockImplementation((eventName, cb) => {
Copy link
Author

@dpchamps dpchamps Sep 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Represents a breaking change: the type of event yielded by executeExecutor is now: https://github.com/nrwl/nx/pull/7110/files#diff-8b1daad050dbc745646b89f91cd20c497a23f51caf7eeaddb96a517a80b8f01fR28

@@ -215,6 +220,10 @@ describe('NodeExecuteBuilder', () => {
});

it('should log errors from killing the process', async () => {
(devkit.runExecutor as any).mockImplementation(function* () {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The old test cases for this spec and the one below were non-deterministic:

The state of subProcess was managed as a file-level global previously. These tests would only pass when at least one previous test had been run to populate the subProcess var.

With that subProcess state encapsulated, there was no way to make this test pass without altering it.

]);

while (iteratorMap.size > 0) {
const result = await Promise.race(iteratorMap.values());
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the crux of the sln, greedily take the next ready value.

Consume each individual iterator lazily: Only tick the last iterator to have won the race. On Build event, reset the state of the subProcess iterator

@@ -97,3 +98,15 @@ export interface NormalizedBuildNodeBuilderOptions
extends BuildNodeBuilderOptions {
webpackConfig: string[];
}

export interface NodeExecuteBuilderOptions {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved this type into utils, because SubProcessRunner uses it as well.

error: unknown;
};

export class SubProcessRunner {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to write tests for this, awaiting feedback for where this should be in the project structure. It seems like it belongs colocated to executor.impl.ts, but I didn't see other examples of doing stuff like that.

The executor.impl tests cover the integration of this in full, but it felt big enough to split into a separate file. Let me know what you think!

@dpchamps dpchamps changed the title Execute iterator exploration fix(node): Cooperative Scheduling between startBuild and subProcess generators Sep 23, 2021
@dpchamps dpchamps marked this pull request as draft September 23, 2021 16:12
Comment on lines +398 to +404
const result = await Promise.race([
executorIterator.next(),
new Promise(() => {
mockSubProcessExit(1);
events.push(firstEvent);
}),
]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const result = await Promise.race([
executorIterator.next(),
new Promise(() => {
mockSubProcessExit(1);
events.push(firstEvent);
}),
]);
const result = await Promise.race([
executorIterator.next(),
new Promise(() => {
mockSubProcessExit(1);
events.push(firstEvent);
// NOTE: this promise intentionally never resolves so that...
}),
]);

This threw me off for a sec :) Might help to leave some breadcrumbs

packages/node/src/executors/execute/execute.impl.spec.ts Outdated Show resolved Hide resolved
packages/node/src/executors/execute/execute.impl.ts Outdated Show resolved Hide resolved
packages/node/src/executors/execute/execute.impl.ts Outdated Show resolved Hide resolved
packages/node/src/executors/execute/execute.impl.ts Outdated Show resolved Hide resolved
@nx-cloud
Copy link

nx-cloud bot commented Sep 23, 2021

Nx Cloud Report

CI is running for commit 26a1a0e.

📂 Click to track the progress, see the status, the terminal output, and the build insights.


Sent with 💌 from NxCloud.

@dpchamps dpchamps marked this pull request as ready for review September 23, 2021 19:42
Copy link
Collaborator

@FrozenPandaz FrozenPandaz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems way too complicated to fix the issue outlined in #7031

I believe the fix should be that executeExecutor should not yield until the subprocess dies?

  1. The executeExecutor keeps iterating over the build events
  2. When it gets a build event, it starts the process
  3. Currently, it yields.. I don't think it should do this yet?
  4. It should wait until the subprocess exits to yield..

In general, we should be yielding results of executing the process here 👀

cc @AgentEnder could you take a look at this please?

@dpchamps
Copy link
Author

dpchamps commented Sep 25, 2021

@FrozenPandaz I agree it's a non trivial fix:

From the issue:

One further thing that's tripping me up a bit here is we'd like to also add support for using process.send to send the server URL up to the builder and trigger an iteration/emission (specifically for getting the started port in our e2e tests when we start a server on port: 0). I don't see an obvious way to do it without merging async iterables but maybe y'all see something obvious?

After speaking with @llwt, we need ideally two things:

  • To yield sub-process events as they occur (@llwt is specifically asking for message).
  • But also allow for build events to come through as needed.

Consider your points 3 and 4:

3. Currently, it yields.. I don't think it should do this yet?
4. It should wait until the subprocess exits to yield..

How do we then both wait for the subprocess to complete (and also yield events from that subprocess) but also allow for new build events to propagate?

For example, when I'm in watch mode, I don't want the subprocess to hold up a new build event. But I also want to make sure that the process exits before the executor completes. So we run into a situation where we need to simultaneously wait for the startBuild iterator to complete and also the most recent subprocess.

More context, if needed
This is all happening within a `for await ... of` loop. Where we need to run one iterator to completion, but wait for at least the last side-effect to complete before returning. It's further complicated by the requirement of yielding events from that side-effect.

So if subprocess events should be yielded here, then (as far as I can see) there's no way to do it without merging the two iterators to allow them to both "tick". SubProcess being push-based doesn't help the situation. I rolled the implementations myself to take an event emitter -> async iterator, but I'm sure there are third party libs to do that as well.

Make sense? If there's a more obvious solution I'm happy to talk it out :)

@llwt
Copy link
Contributor

llwt commented Oct 19, 2021

Bump @vsavkin / @AgentEnder,

Anything we can do to move this along? Currently serving an app outside of watch mode is not possible and we're in a broken state since updating to the latest version of nx.

Happy to modify the PR, but we need a bit of guidance as to what ya'll would like to see. We don't really see a simpler way to do this with the change to async iterators, however, maybe your eyes see something we don't?

@AgentEnder
Copy link
Member

I'll take a look at this over the week, sorry for the delay.

@llwt
Copy link
Contributor

llwt commented Oct 20, 2021

Thanks!

@cdaringe
Copy link
Contributor

Hey all,

I just want to chime in with a respectful, kind plea for attention here. We've been blocked on this for a while now, and are eager to pay back the techdebt accrued while blocked by a fix. @dpchamps explanation is above is fine and good, as is @llwt 's in #7710. i want to emphasize how fundamental of a problem this is, perhaps to elevate the priority here.

here's what we cant do right now:

// my-executor.ts
runExecutor(nodeExecutorInput1)
runExecutor(nodeExecutorInput2)

can't do it.

this is because execute.impl.js does this:

function runProcess(event, options) {
    if (subProcess || !event.success) {
        return rxjs_1.EMPTY;
    }
    return new rxjs_1.Observable((subscriber) => {
        subProcess = child_process_1.fork(event.outfile, options.args, { // 👀 👀 👀
            execArgv: getExecArgv(options),
        });

both executions race to squash the subProcess, and it creates problemos. Global, mutable singleton, getting crushed by competition.

We are willing to pair and discuss this further! :) I thought i'd try my hand in articulating the problem from a different perspective. Thanks

@AgentEnder
Copy link
Member

@dpchamps Ok, I know I've said I'd look over this before but I'm going to spend some real time w/ this problem this week.

At minimum this PR would need to be rebased, but lets chat some on the issue first to make sure this is still the solution that we should pursue.

@AgentEnder
Copy link
Member

@cdaringe it seems like this issue might be a bit different to the above discussion, albeit it would disappear if you were able to wait for the first executor to finish... From the code sample it sounds more like you want to be able to run two node execute runs in parallel from a custom executor. It may be a good idea to open a new issue for that.

@dpchamps
Copy link
Author

dpchamps commented Dec 14, 2021

@AgentEnder I met with @vsavkin a while back and he agreed to take these changes into a new more general executor. The "complicatedness" comes mostly from poor naming: its not "cooperative scheduling" so much as it is a need for some type merge+prioritization for two generators.

After the discussion with Victor, we arrived at the general conclusion that these changes did not appear to be too complicated for the problem at hand.

The issue that @cdaringe is calling out is fixed in this PR due to some non-determinism caused by having module-level mutable state.

@vsavkin vsavkin force-pushed the master branch 2 times, most recently from 3a99403 to 334c230 Compare January 6, 2022 18:35
@vsavkin vsavkin assigned jaysoo and unassigned AgentEnder and nartc Jan 20, 2022
@jaysoo
Copy link
Member

jaysoo commented Jan 27, 2022

Hi @dpchamps Thanks so much for your contribution. I'm going to close this PR since we are moving focus into the @nrwl/js:node executor, which is meant to run a build then execute (same as @nrwl/node:execute).

I will double check if the issue outlined in #7031 is still present.

@jaysoo jaysoo closed this Jan 27, 2022
@dpchamps
Copy link
Author

dpchamps commented Jan 27, 2022

@jaysoo I'm curious if you and team are planning on addressing @llwt's request in #7031:

One further thing ... is we'd like to also add support for using process.send to send the server URL up to the builder and trigger an iteration/emission (specifically for getting the started port in our e2e tests when we start a server on port: 0)

This PR presented an idea for how to make that possible.

We very much need this functionality. It's arguably more important to us than the initial bug presented, and is what @vsavkin and myself spent most of our time discussing in the call.

@github-actions
Copy link

This pull request has already been merged/closed. If you experience issues related to these changes, please open a new issue referencing this pull request.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 19, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

The node:execute builder immediately exits after build when --watch=false.
7 participants