
Add WritableStreamDefaultController.releaseBackpressure() #1190

Draft · MattiasBuelens wants to merge 13 commits into main from ws-release-backpressure

Conversation

@MattiasBuelens (Collaborator) commented Nov 25, 2021

This adds a new releaseBackpressure() method (name to be decided) to WritableStreamDefaultController, allowing a WritableStream with HWM = 0 to resolve the pending writer.ready promise manually.
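
As a rough sketch of how an underlying sink could use the new method (the name and exact semantics are still being worked out, so treat this as illustrative rather than as settled API):

const unbuffered = new WritableStream({
  start(controller) {
    // With HWM = 0, desiredSize starts at 0 and writer.ready starts out pending.
    // The proposed method lets the sink resolve it explicitly once it is able
    // to accept a chunk.
    controller.releaseBackpressure();
  },
  write(chunk, controller) {
    console.log(chunk);
    // Signal readiness for the next chunk; without some call like this,
    // writer.ready would stay pending because desiredSize never rises above 0.
    controller.releaseBackpressure();
  }
}, { highWaterMark: 0 });

const writer = unbuffered.getWriter();
await writer.ready;      // resolves after the start() call above
await writer.write("a"); // ready resolves again once write() signals readiness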

This also integrates with TransformStreams: when pull() is called on the readable side, the transform stream calls releaseBackpressure() on its writable side. This allows piping through TransformStreams with HWM = 0 on both their readable and writable side, enabling "unbuffered" transformations:

const rs = new ReadableStream({
  start(c) {
    c.enqueue("a");
    c.enqueue("b");
    c.enqueue("c");
    c.close();
  }
});

const upperCaseTransform = new TransformStream({
  transform(chunk, controller) {
    controller.enqueue(chunk.toUpperCase());
  }
}, { highWaterMark: 0 }, { highWaterMark: 0 }); // writable strategy, readable strategy

const ws = new WritableStream({
  write(chunk) {
    console.log(chunk);
  }
});

await rs.pipeThrough(upperCaseTransform).pipeTo(ws);

Previously, the above snippet would stall: with HWM = 0 on the transform's writable side, desiredSize never rose above 0, so the pipe's writer.ready promise never resolved and the first chunk was never written.

Fixes #1158.

To do:

  • Write spec text
  • Write more tests
  • Bikeshed on the method name

(See WHATWG Working Mode: Changes for more details.)


Preview | Diff

@domenic (Member) commented Nov 29, 2021

Looks reasonable so far. Should we make unbuffered transforms the default? I suspect it's more desirable, and hopefully wouldn't be too breaking...

@MattiasBuelens (Collaborator, Author)

Should we make unbuffered transforms the default? I suspect it's more desirable, and hopefully wouldn't be too breaking...

I'd like that a lot.

  • Identity transform streams would no longer unnecessarily increase the total buffer size of a pipe chain.
  • In a pipe chain, we would only start pulling from the source once the chain is connected to either a buffered transform stream or to the final sink. (Right now, we start pulling from the original source stream as soon as we pipe it to the first transform stream, which might not yet be connected to a sink; see the sketch below.)

I tried out the change locally, and it looks like we only need to change a couple of tests around writer.desiredSize. So I think we can get away with it? 🤞
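
To illustrate the second point above, here is a sketch of how a pipe chain would behave if identity transforms defaulted to HWM = 0 on both sides (source and destination stand in for an arbitrary readable source and writable sink):

// Nothing is pulled from `source` while the chain is still being assembled:
const step1 = source.pipeThrough(new TransformStream()); // no pull yet
const step2 = step1.pipeThrough(new TransformStream());  // still no pull
// Only once the chain reaches the final sink (or a buffered transform stream)
// does demand propagate back up the chain and pulling from `source` begin.
await step2.pipeTo(destination);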

@ricea (Collaborator) commented Nov 30, 2021

Looks reasonable so far. Should we make unbuffered transforms the default? I suspect it's more desirable, and hopefully wouldn't be too breaking...

I'm concerned that people may be relying on the existing behaviour. With buffering, all the transforms in a pipe can run in parallel, but without it they will run in series. This could cause mysterious performance regressions.

@MattiasBuelens (Collaborator, Author)

Hmm, good point. We don't know if or when transform() will call enqueue(). It might do it synchronously (e.g. for an identity transform or for TextDecoderStream), in which case buffering doesn't help and writable HWM = 0 would make more sense. But we cannot assume that. 🤔

If we don't change the default, we should document how authors can construct an unbuffered transform stream, and when they should use this. We should also go through other specifications that use TransformStream and check if they should have writable HWM 0 or 1.
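
For illustration, such documentation might contrast the two cases roughly as follows (assuming this PR's behaviour; expensiveWork is a placeholder for some asynchronous per-chunk operation):

// Synchronous transform: it enqueues its result before transform() returns,
// so a writable-side buffer adds memory without adding parallelism.
// Unbuffered (HWM = 0 on both sides) is the better fit.
const syncTransform = new TransformStream({
  transform(chunk, controller) {
    controller.enqueue(chunk.toUpperCase());
  }
}, { highWaterMark: 0 }, { highWaterMark: 0 });

// Asynchronous transform: it holds on to each chunk for a while, so the
// default writable HWM = 1 lets the previous stage hand over the next chunk
// in the meantime.
const asyncTransform = new TransformStream({
  async transform(chunk, controller) {
    controller.enqueue(await expensiveWork(chunk));
  }
});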

@MattiasBuelens force-pushed the ws-release-backpressure branch 2 times, most recently from ccea28d to 453f784, on November 30, 2021 at 14:56.
@MattiasBuelens (Collaborator, Author)

I've updated "set up a TransformStream" to take optional arguments for the writable/readable HWMs and size algorithms, so that we can use it in "create an identity TransformStream".

As I said before, we'll want to update other specifications to also set the writable HWM to 0 for synchronous unbuffered transform streams, such as CompressionStream and TextDecoderStream.

@ricea (Collaborator) commented Dec 1, 2021

I haven't had a chance to look at the implementation in detail. I think Chrome is interested in this functionality in general. I will leave it to @domenic to make the call on when we are happy with the spec change.

In the spirit of bikeshedding, I generally like the name releaseBackpressure() although I'd make it one word if I could. One problem I see with the name is that people might interpret it to mean that it will release backpressure even if desiredSize is negative.

Alternative names I've thought of

  • acceptChunk() - has the same problem
  • ready() - ambiguous
  • markReady() - also might be misinterpreted
  • pull() - confusing?

@domenic (Member) commented Dec 1, 2021

So basically if you're doing a sync transform, you should use HWM = 0, but an async transform should use HWM = 1 (or higher)? That's rough for web developers. I guess the conservative default is to stick with 1 and tell people to customize for sync transforms...

Maybe we could make it better in the future with something like TransformStream.fromSyncMap(x => y), but I dunno, then you might want to have flush logic...
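
Purely as a sketch of that idea (fromSyncMap is hypothetical and not part of any spec), such a helper could roughly desugar to:

function fromSyncMap(mapFn) {
  return new TransformStream({
    transform(chunk, controller) {
      controller.enqueue(mapFn(chunk));
    }
  }, { highWaterMark: 0 }, { highWaterMark: 0 });
}

// e.g. readable.pipeThrough(fromSyncMap(x => x.toUpperCase()))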

Alternative names I've thought of

I like markReady() since the ready promise is the existing developer-facing manifestation of backpressure. I agree it's hard to communicate the conditional nature of the API, without a long name.

index.bs (Outdated)

@@ -1150,8 +1150,8 @@ interface ReadableStreamDefaultReader {
ReadableStreamDefaultReader includes ReadableStreamGenericReader;

dictionary ReadableStreamDefaultReadResult {
  any value;
  boolean done;
  any value;
@domenic (Member)

Might be worth pulling the Web IDL and editorial changes out into a separate PR that we can just approve and merge ASAP.

@MattiasBuelens (Collaborator, Author)

Fair point, opened #1192.

@MattiasBuelens (Collaborator, Author)

Dangit, I probably should have moved 4a4a5ee out to that PR too. 🤦‍♂️

index.bs

algorithm <dfn export for="TransformStream/set up"><var>writableSizeAlgorithm</var></dfn>, an
optional number <dfn export for="TransformStream/set up"><var>readableHighWaterMark</var></dfn>
(default 0), and an optional algorithm <dfn export
for="TransformStream/set up"><var>readableSizeAlgorithm</var></dfn>, perform the following steps.
@domenic (Member)

We could have separate set up algorithms for sync vs. async transforms, and thus take away the responsibility of picking the right high water mark from the other spec author. So far other specs haven't needed these extra customization points and I think it might be simpler not to give them out.

(On the other hand, the fact that existing byte-stream-transform specs are calculating the size of all chunks as 1 is a bit unfortunate...)

@MattiasBuelens (Collaborator, Author)

Shall I remove the HWM and size algorithm arguments from "set up a ReadableStream" and "set up a WritableStream" too then?

(On the other hand, the fact that existing byte-stream-transform specs are calculating the size of all chunks as 1 is a bit unfortunate...)

Hmm, true. Once we have proper byte support for WritableStream and TransformStream, we'll have "set up with byte reading support" versions for those too, but for now we're in a bit of an awkward position. 😕

Maybe we can already provide such algorithms, even though they won't vend BYOB readers or BYOB writers yet?

@jan-ivar

I'm concerned that people may be relying on the existing behaviour. With buffering, all the transforms in a pipe can run in parallel,

@ricea I assume you mean "all the transforms in a pipe can run in parallel with each other" here. I.e. they all have data available to work on at the same time.

but without it they will run in series.

And here your concern is that, at any one point in time, all but one of them will sit idle waiting for data from upstream?

I don't think that'd be the case. Assuming sufficient input data and transforms taking approximately the same time, all async transforms should still run in parallel with each other, even with HWM = 0. The only difference being that, from the perspective of each transform, the transform upstream of it is working on a newer chunk, at the same time.

  • Snapshot at time X: camera → transformA(frame4) → transformB(frame3) → transformC(frame2) → sink(frame1)

These seem like reasonable assumptions for a healthy real-time stream.

But what seems true (and maybe this is what you worry about) is that this ideal never happens, and even tiny transform delays will accumulate (delayA + delayB + delayC), which could blow a realtime buffer. Sure.

For example, transformB spikes, taking 50% longer than usual on frame3, but then immediately makes up for it by taking half the normal time on frame4. Here HWM = 1 might improve resilience to variance. But smoothing out any longer delays than that would take a deeper buffer anyway, so I don't think the difference between 0 and 1 is foundational, unless I'm missing something.

This could cause mysterious performance regressions.

@jan-ivar

The above is my picture (which could be off) of how the pipe is filled in response to requests for new data that propagate up the pipe chain based on transforms resolving their pending writer.ready promise, which

  • for HWM=0: would be from markReady() in response to a downstream pull, and
  • for HWM=1: would be from desiredSize dipping above zero, effectively also in response to a downstream pull

I suppose one difference is that during this propagation (is it async?), no work is happening in parallel in the former case for a brief moment, while in the latter case we know each transform has at least one item to chew on? In either case downstream is driving this.
