Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stdin.read interface #68

Closed
ruyadorno opened this issue May 6, 2020 · 11 comments
Closed

stdin.read interface #68

ruyadorno opened this issue May 6, 2020 · 11 comments

Comments

@ruyadorno
Copy link
Member

ruyadorno commented May 6, 2020

I'm pretty sure I heard someone in one of the meetings in the past mention it before but I couldn't find any issue so I'm filling this one so that we can track/discuss it.

Description

A different interface for reading from standard input, making it simpler to create scripts that just need to consume a blob of data from stdin at once.

Desired behaviors

  • Enable simple oneliner scripts
  • Avoid Streams interface
  • TBD

Userland alternatives

/cc @Ethan-Arrowood

@ruyadorno
Copy link
Member Author

I'm really not sure how feasible it is and what an useful API would look like but I just want to make sure we have a place to follow up with the discussion, feel free to contribute more userland alternatives, desired outcomes and correct me if anything sounds wrong in the original description.

@Ethan-Arrowood
Copy link

For reference, I was creating a script that inflates the output of a git object on the command line. The idea was I could pipe the output of cat .git/objects/<hash> to node -e 'zlib.inflateSync(process.stdin.read()).pipe(process.stdout)' for a quick inflate script as I test some code i'm writing.

The solution for my original problem was utilizing zlib.createInflate()

node -e 'process.stdin.pipe(zlib.createInflate()).pipe(process.stdout)'

@Ethan-Arrowood
Copy link

This isn't a pressing feature for me either, but it is an interesting use case and benefit to the userland. So I'm happy to champion if we think it is a useful addition

@coreyfarrell
Copy link
Member

Could Buffer.fromStream(stream, (error, buffer) => {}) be a reasonable API? This could be promisified to support const buffer = await fromStream(stream);.

Additional possibly related prior art: https://github.com/maxogden/concat-stream/ (lacks an isTTY check).

@sam-github
Copy link

Isn't the pipe solution both more succinct, and will not load the entire git object into memory (exploding if its larger than available memory)?

Note that reading a stream of unknown length (stdin, for example) into a memory buffer involves double copying (at least) all the data, as well as potentially unconstrained memory allocation (streams don't have to have an end, cat /dev/zero | node -e '....').

Why would concat-stream need a isTTY check?

@coreyfarrell
Copy link
Member

I agree the pipe solution is probably best for what @Ethan-Arrowood was doing and reading an entire stream into memory is inefficient / has risks. I have run into situations where "just get me the complete data" has been very useful (admittedly always in tests).

I don't think concat-stream needs a isTTY check, I was just commenting for the purpose of prior art you would have to guard against process.stdin.isTTY to use it as a way to get the entire stdin buffer.

After reading @sam-github's comment and my own realization that I've never used this sort of functionality in production I have to ask is this something that even belongs in the node.js core?

@Ethan-Arrowood
Copy link

After reading @sam-github's comment and my own realization that I've never used this sort of functionality in production I have to ask is this something that even belongs in the node.js core?

That is a great point @coreyfarrell I doubt I'd use the script for anything other than debugging/testing in my terminal.

@ruyadorno
Copy link
Member Author

ruyadorno commented May 16, 2020

now that TLA landed on master, combined with the async/await stream support we have a more reasonable interface for consuming process.stdin in such eval oneliners:

-cat README.md | node -e "inp=''; process.stdin.on('data',c=>inp+=c).on('end',()=>process.stdout.write(inp.replace(/[a-z]/g, 'x')))"
+cat README.md | node -e "for await (const c of process.stdin) process.stdout.write(c.toString().replace(/[a-z]/g, 'x'))"

Note: Requires --experimental-top-level-await and --input-type=module flags

@ruyadorno
Copy link
Member Author

I'm pretty sure I heard someone in one of the meetings in the past mention it before but I couldn't find any issue so I'm filling this one so that we can track/discuss it.

oh, it sounds like that person may be @bengl (ref: https://twitter.com/bengl/status/1402014875870765057)

@bengl
Copy link
Member

bengl commented Jun 8, 2021

@ruyadorno

It looks like the future looks something like this:

cat README.md | node -e "process.stdout.write(Buffer.concat(await process.stdin.toArray()).toString().replace(/[a-z]/g, 'x'))"

It would be nice to get rid of the Buffer.concat with some version of flat() that works interoperably between Arrays and TypedArrays.

Alternatively, some kind of async .concat() or .consume() on buffer streams would be pretty cool. On non-buffer streams it could give an array of all the objects.

If there's a worry that the promise would never resolve if the stream doesn't end: Consider that we already have this effect synchronously with [...iteratorThatNeverEnds].

bengl added a commit to bengl/node that referenced this issue Jun 9, 2021
Returns a promise fulfilling with a sensible representation of the
entire stream's data, depending on the stream type.

Buffer streams collect to a single contiguous Buffer. Buffer streams
with encoding set collect to a single contiguous string. Object-mode
streams collect to an array of objects.

Limiting is allowed, enabling situations like capping request body
sizes, or only consuming a limited amount of process.stdin.

Ref: nodejs#35192
Ref: nodejs/tooling#68
@ruyadorno
Copy link
Member Author

oh thanks @bengl ! following your PR I learned about the Stream consumers API, that might be what I was looking for when I first opened this issue originally:

const { text } = require('stream/consumers')
await text(process.stdin)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants