Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Please consider web compatible stream() method. #51

Closed
Gozala opened this issue Jul 12, 2020 · 9 comments · Fixed by #83 or #103
Closed

Please consider web compatible stream() method. #51

Gozala opened this issue Jul 12, 2020 · 9 comments · Fixed by #83 or #103

Comments

@Gozala
Copy link
Contributor

Gozala commented Jul 12, 2020

Currently implementation deliberately diverges from the Blob spec by using node Readable stream instead web ReadableStream. This is problematic because code that runs both in browser and node can not use blob.stream() without having to check what the result is.

How about either implementing web compatible stream method, or not implementing it at all.

@Gozala
Copy link
Contributor Author

Gozala commented Jul 18, 2020

I would love to converge https://www.npmjs.com/package/web-blob with fetch-blob, however incompatibility of stream method prevents me from doing it. If I were to do the work to replace current node Readable with web ReadableStream is it going to fly ?

@jimmywarting
Copy link
Contributor

As long as the stream is async iterable node-fetch will be happy with it.
but i would rather have it be some optional dependency.

@jimmywarting jimmywarting mentioned this issue Apr 20, 2021
@jimmywarting
Copy link
Contributor

jimmywarting commented Apr 21, 2021

Made a decision to degrade the stream() to barely do what both node & whatwg stream have in common in the v3, which is the async iterable - this would make it work in browser without the need of nodes stream or buffer module. and make it unopinionated to not pick anyones side between node or whatwg stream

Guess It can stay like that until node implements whatwg stream
For those who want to really have a node stream can now do stream.Readable.from(blob.stream()) or whatwg equivalent upcoming version whatwg/streams#1018 (which can be polyfilled too)

...And for those who do not care about either node or whatwg stream can use

for await (const chunk of stream) { ... }

All doe async iterable ReadableStream is not implemented anywhere in any browser AFAIK
So you can't really do for await (const chunk of stream) { ... } just yet but you could polyfill it by adding ReadableStream.prototype[symbol.asyncIterator] also

@Gozala
Copy link
Contributor Author

Gozala commented Apr 26, 2021

Made a decision to degrade the stream() to barely do what both node & whatwg stream have in common in the v3, which is the async iterable - this would make it work in browser without the need of nodes stream or buffer module. and make it unopinionated to not pick anyones side between node or whatwg stream

In practice it is not the case:

  1. No browser implements / ships async iteration for streams yet.
  2. Streams have bunch of things that async iterables do not.

I think it would be a lot better to just not have stream() method and have a different one if it is going to be a different thing. That way it is easy to subclass and provide web API compatible version.

@jimmywarting
Copy link
Contributor

jimmywarting commented Apr 26, 2021

I think it would be a lot better to just not have stream() method and have a different one if it is going to be a different thing. That way it is easy to subclass and provide web API compatible version.

how would you go about subclassing this if it can't provide a stream method? slicing the blob, read each chunk as arraybuffer or something like that? similar to this pr/file? both text() and arrayBuffer() also depends on some form of stream() method, but it could have a internal (private-iterator) stream instead of using the public version...

fyi.

  • node-fetch is very dependent on blob.stream() and for it being (at the very least) async iterable. hence why some form of stream() is needed.
  • plus this is mostly built for the server, so I do not know why you would use fetch-blob over window.Blob in the browser that is fully spec compatible?
    ...with that in mind you would probably also have web-stream-polyfill if you run this on nodejs - which provide Symbol.asyncIterator

I can see some few solution to this...

  1. We could check if globalThis.ReadableStream exist and "upgrade" the stream method into a whatwg stream if it exist
if (globalThis.ReadableStream) {
  const orig = Blob.prototype.stream
  Blob.prototype.stream = function() {
    const iterator = orig.call(this)
    return new ReadableStream({
      pull: ctrl => iterator.next().then(({value, done}) done ? ctrl.close() : ctrl.enqueue(value))
    })
  }
}

This could be subclassed in user-land also. it is easy to check if stream is not spec compatible and see if it's a generator function or regular function with this detection script

if (Blob.prototype.stream.constructor.name === 'AsyncGeneratorFunction') {
  // not spec compatible, subclass it...
}

another solution could be to polyfill ReadableStream.prototype[Symbol.iterator] so you could actually use the for-await-of loop in browser (and everywhere else) also instead of relying on the full stream implementation.

// Here is a very basic one
if (!ReadableStream.prototype[Symbol.asyncIterator]) {
  ReadableStream.prototype[Symbol.asyncIterator] = async function* () {
    const reader = this.getReader()
    while (1) {
      const chunk = await reader.read()
      if (chunk.done) return chunk.value
      yield chunk.value
    }
  }
}

The benefit of this is that your for..of code could then handle all three cases where it's either node stream, whatwg stream or a generator (or anything with iterator symbol in its prototype)


There are few requirements on this blob implementation.

  • node-fetch is dependent on a asyncIterable blob version of blob.stream()
    • we can't simply remove stream() if whatwg streams don't exist (cuz that would break things)
  • We do not want to import/depend on a hole whatwg stream polyfill in nodejs (it's simply too large).

this was built for nodejs after all so the polyfilled version of whatwg stream and node stream are async iterator so i think it's best to use for-await-of in most cases to avoid any environmental differences

other than those two/three requirements i'm open to suggestions

@Gozala
Copy link
Contributor Author

Gozala commented Apr 28, 2021

First of all thanks you for willingness to carry on this conversation, I really appreciate it. I'll try to respond inline below

How would you go about subclassing this if it can't provide a stream method?

class Blob extends FetchBlob {
   stream() {
      return asyncIterableToReadableStream(FetchBlob.asyncIterate(this))
   }
}

Where I assume static asyncIterate(blob) could do what blob.stream() does currently. Alternatively it could be asyncIterableToReadableStream(blob[GOOD_NAME]()).

node-fetch is very dependent on blob.stream() and for it being (at the very least) async iterable. hence why some form of stream() is needed.

I'm not opposed to async iteration API in blob, on the contrary (In fact I wish blob[Symbol.asyncIterator] was standardized). I do however find incompatibly with standard stream() API problematic.

plus this is mostly built for the server, so I do not know why you would use fetch-blob over window.Blob in the browser that is fully spec compatible?
...with that in mind you would probably also have web-stream-polyfill if you run this on nodejs - which provide Symbol.asyncIterator

I think there is a bit of misunderstanding here. I do not have a case where fetch-blob would be preferred over window.Blob. I do however work with code few layers up that operates on Response, Request, Blob objects and is agnostic of runtime and in fact runs in both node and browsers. Unfortunately however API incompatibility across runtimes does cause issues and introduces complexity as it is forced to detect if blob stream is actually a stream or not and if not do something else. And that bleeds all over the place. In theory one could update APIs to pass async iterables around instead, but in practice it's difficult because:

  1. Not all components at play are under our control.
  2. There are good reasons to keep blobs as you can slice, concat and upload without loading things into memory (which is tricky to do if you target both node and browsers).

More recently I've also run into the same incompatibily problem with a code which targets both node and cloudfare workers (that has native fetch and blob impls).

We could check if globalThis.ReadableStream exist and "upgrade" the stream method into a whatwg stream if it exist

This is not a great option because while global polyfills seem fine when building an app, they don't seem ok in libraries. Having to tell library users to also bring global polyfill does not seem like a great option.

I can see some few solution to this...

You are right that one could subclass fetch-blob today like this:

class MyBlob extends FetchBlob {
   stream() {
      return asyncIterableToReadableStream(super.stream())
   }
}

And maybe it would not break any assumptions anywhere and will work.

The benefit of this is that your for..of code could then handle all three cases where it's either node stream, whatwg stream or a generator (or anything with iterator symbol in its prototype)

You are right about benefits but:

  1. There are also tradeoffs (especially in browser context)
  2. Things get complicated when dealing with components from others as you have to convince them to do the same (while they may have good reasons not to).

There are few requirements on this blob implementation.

  • node-fetch is dependent on a asyncIterable blob version of blob.stream()

It is true but again node-fetch could just as well depend on any other method. In fact I have filled the issue to that end there as well node-fetch/node-fetch#1120 and wrote POC https://github.com/web-std/node-fetch/pull/2/files

* We do not want to import/depend on a hole whatwg stream polyfill in nodejs (it's simply too large).

That is very reasonable position. But again you could choose to expose same functionality with a different name. It would work in node just the same and it would make it straight forward for someone to just combine two libs to get a whatwg compliant version.

P.S.: I do not think this alone provides much without node-fetch

@Gozala
Copy link
Contributor Author

Gozala commented Apr 28, 2021

For bit more context, I found incompatibilities so problematic that I end up:

  1. Implementing my own blob implementation https://github.com/web-std/io/tree/main/blob
  2. Implementing web File API on top of it https://github.com/web-std/io/tree/main/file
  3. Implementing web FormData API https://github.com/web-std/io/tree/main/form-data
  4. FromData support in fetch Fix/form data in response node-fetch#1118
  5. Forknig fetch that uses web streams instead https://github.com/web-std/node-fetch/pull/3/files

But unfortunately it is nearly impossible to backport fixes and I would much rather contribute to the community instead of forking it.

@jimmywarting
Copy link
Contributor

ping @bitinn, @Richienb, @tinovyatkin

@jimmywarting
Copy link
Contributor

Good news!

WHATWG streams have landed in NodeJS (as experimental)
nodejs/node#39062

Think we should start using it whenever possible, might be worth looking into adding whatwg stream polyfill now after all!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants