Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow multiple readers at once #121

Merged
merged 1 commit into from
Mar 14, 2024
Merged

Conversation

ehmicky
Copy link
Collaborator

@ehmicky ehmicky commented Mar 14, 2024

Fixes #119.

I checked Node.js default iterable Readable[Symbol.asyncIterator] (source here), for comparison.

I believe the new behavior to be fairly close to the previous behavior, except that multiple concurrent readers are now possible. That being said, there are slight differences, such as:

  • Some error thrown in an edge case having a different error.code
  • Some iterations processing smaller chunks at a time
  • And other minor details

For most users, this should not matter, but to be on the safe side, this probably should be in a major release. We probably should try to solve #116 as part of that major release (I am looking into it right now).

Note: to double check, I also ran this PR against Execa automated tests, and they all passed.

highWatermark: stream.readableHighWaterMark,
})) {
yield chunk;
}
Copy link
Collaborator Author

@ehmicky ehmicky Mar 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on() changed with Node 20.0.0, more specifically with this PR.

That PR fixes many subtle bugs with on(), so it took some effort to make this work on both Node 18 and 20, but this should work now.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe target Node.js 20? I think we should target Node.js 20 in Execa anyway. Node.js 18 goes out of maintenance in April, which is not far away.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, I believe Node 18 is in maintenance one more year: until April 2025.

I also initially thought it was out of maintenance next month. Maybe we got this impression because they unsupported Node 16 earlier than usual (was in September 2023 instead of April 2024) due to some security issue.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh. Yeah, I totally misread that...

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so it took some effort to make this work on both Node 18 and 20, but this should work now.

Maybe worth adding a TODO comment about how it could be simplified when Node.js 20 can be targeted.

Copy link
Collaborator Author

@ehmicky ehmicky Mar 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was initially using the iterable returned by once() directly, and calling iterator.return()/iterator.throw().

To support both Node 18 and 20, I switched to using the signal option, which did not change between Node 18 and 20. It turns out this is actually a better option for any Node version.

So the code can remain the same even after removing support for Node 18.

Ugh. Yeah, I totally misread that...

I did too! I realized this a few weeks ago. I do think that early deprecation of Node 16 might be the cause for our confusion.

try {
for await (const [chunk] of nodeImports.events.on(stream, 'data', {
signal: controller.signal,
highWatermark: stream.readableHighWaterMark,
Copy link
Collaborator Author

@ehmicky ehmicky Mar 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The highWatermark option is undocumented but it is quite nifty. It was added by nodejs/node#41276 for readline Interface.

I have opened nodejs/node#52078 since I'm not sure it was left undocumented on purpose.

Regardless, even if the option disappeared, this option is only meant to reduce memory consumption (by automatically pausing/resuming the stream when too much data is buffering), so it would not break the usage.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently, they are already working on it! 🎉

nodejs/node#52080

import('node:stream/promises'),
]);
nodeImports = {events, streamPromises};
};
Copy link
Collaborator Author

@ehmicky ehmicky Mar 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The on() and finished() functions have all the logic we need for this PR, and it does not quite make sense to re-implement them.

However, browsers don't have those methods. They also don't need to, since they don't use Readable Node.js streams, so they won't run this part of the logic. But the import statements would fail in browsers.

This PR is currently fixing this using dynamic imports. This is not great. Besides it being hacky, it's got one practical issue which is: when executing getStream(), users might assume that the stream data and error events are being listened to as soon as getStream() starts. But with those dynamic imports, that's not the case.

This means if another consumer has called stream.on('data'), it might start pulling some data that getStream() might miss while the dynamic imports are resolving. This also means any error event on the stream will be unhandled (which crashes the process) while the dynamic imports are resolving.

One potential solution could be to use static imports and hope that browser users' bundlers ignore/delete imports that start with the node: prefix. Is this something bundlers now do in 2024? I am not sure. 🤔

Another potential solution could be to have a separate entrypoint for browser users. This might be cleaner and allow us to add more Node-specific and browser-specific, if we ever need to (related: #116).

I am really not sure what the best approach is here. What are your thoughts about this?

Copy link
Collaborator Author

@ehmicky ehmicky Mar 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might also combine the last two solutions?

  • We use static import ... from 'node:stream' like a regular Node module
  • We provide a second entrypoint identical, except without those static imports, for users who both:
    • must support browsers
    • use a bundler that cannot handle node:* imports

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unlikely scenario, but regardless, I think using separate browser and node exports in package.json may be the best solution in this case.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good.
Should I do this in a second PR after this one?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@ehmicky ehmicky force-pushed the stream-multiple-consumers branch 5 times, most recently from 52536da to 9779d56 Compare March 14, 2024 02:49
@sindresorhus sindresorhus merged commit a51d085 into main Mar 14, 2024
6 checks passed
@sindresorhus sindresorhus deleted the stream-multiple-consumers branch March 14, 2024 17:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow multiple readers at once
2 participants