Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add WebSocket::inner method #337

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

main--
Copy link

@main-- main-- commented May 26, 2023

This is useful to call e.g. buffered_amount.

hamza1311
hamza1311 previously approved these changes Jun 17, 2023
Copy link
Collaborator

@hamza1311 hamza1311 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why expose inner instead of adding a buffered_amount method?

@@ -199,6 +199,11 @@ impl WebSocket {
})
}

/// Provides a reference to the JS `WebSocket` instance wrapped by this struct.
pub fn inner(&self) -> &web_sys::WebSocket {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be into_inner and consume self.

@hamza1311 hamza1311 dismissed their stale review June 17, 2023 12:40

Accidental approval

@main--
Copy link
Author

main-- commented Jun 17, 2023

This should be into_inner and consume self.

That would defeat the whole point of this PR: if I had a way to turn a raw websocket back into a gloo websocket, then I could just create the WS myself, clone the reference, and then pass that to gloo.

Why expose inner instead of adding a buffered_amount method?

While buffered_amount would suffice for my particular use case, direct access to the underlying JS websocket reference is generally convenient for cases where gloo’s API is not sufficient.

Furthermore, it does not look like any memory safety invariants can be broken with access to the inner reference. And if I intentionally break gloo by e.g. overriding the onmessage handler, then that’s obviously on me.

@hamza1311
Copy link
Collaborator

And if I intentionally break gloo by e.g. overriding the onmessage handler, then that’s obviously on me.

The API should prevent you from being able to make such a mistake

@main--
Copy link
Author

main-- commented Jun 17, 2023

You’re not going to be able to prevent someone from monkeypatching the WebSocket constructor 🤷‍♂️

As I said: buffered_amount alone would be sufficient for my personal use case. But as things are right now, if I have a use case that isn’t directly served by gloo’s API then I am effectively forced to fall back to raw Websockets entirely, and that is obviously substantially more error-prone than having a small escape hatch.

Btw, the reason I need buffered_amount is to provide backpressure. One might argue that gloo should implement backpressure (since it’s not intuitive for a Sink to drop messages on the floor just because it’s overwhelmed), so that’s yet another option, albeit one that requires a significant amount of code and introduces a subtle change in behavior.

@futursolo
Copy link
Collaborator

futursolo commented Jun 17, 2023

I am also in favour of a consumption-based API into_inner.
web_sys is an FFI interface, when exposing the underlying raw instance the wrapper is usually consumed.

You’re not going to be able to prevent someone from monkeypatching the WebSocket constructor

The same argument can be raised to say that you can use LLDB to modify the memory and violate compile time safety put in place by Rust compiler. Because something that is not possible to completely prevent from happening doesn't mean that any effort put towards it to provide a better sound API is not worth it.

On the topic of providing back pressure, I do not think there is a good way of doing it unless WebSocket provides events to help monitoring the watermarks (Such as events to signal buffer level) so we can wake up sink based on it. SinkExt::flush() also doesn't work because of this.

since it’s not intuitive for a Sink to drop messages on the floor just because it’s overwhelmed

I don't think JavaScript's WebSocket implementation will drop messages just because it is back logged as well? I think the buffer in WebSocket is unbound and can grow indefinitely. (Until an unknown limit set by the browser is reached.)

@main--
Copy link
Author

main-- commented Jun 19, 2023

You’re not going to be able to prevent someone from monkeypatching the WebSocket constructor

The same argument can be raised to say that you can use LLDB to modify the memory and violate compile time safety put in place by Rust compiler. Because something that is not possible to completely prevent from happening doesn't mean that any effort put towards it to provide a better sound API is not worth it.

By the same reasoning, it should not be possible to write unsafe code in Rust at all. The existence of an explicit escape hatch does not make an API unsound.

My point is that a hypothetical bad-faith user of your API will always be able to break things, no matter how hard you try. If I want my code to erroneously not receive websocket messages, then I don't have to resort to e.g. overriding gloo's onmessage handler - I can just "accidentally" try to read items from a futures::stream::Pending instead of the gloo_net::websocket::futures::WebSocket. It sounds ridiculous, but it's really the same thing: I am intentionally and very explicitly shooting myself in the foot, and no amount of safeguards is going to prevent that.

gloo makes the rules on what one is and isn't allowed to do with the API. You're saying there is an implicit rule that monkeypatching the WebSocket constructor is not allowed and may cause gloo to misbehave. This is perfectly reasonable. What I don't understand is why you're opposed to applying the same approach to the proposed inner function. A very simple rule here could e.g. be: "you are not allowed to mutate the underlying websocket using the reference returned from inner". I don't see why you would consider this API to be unsound, considering that it is impossible to accidentally misuse by accident and even if misused, it will never violate memory safety.

On the topic of providing back pressure, I do not think there is a good way of doing it unless WebSocket provides events to help monitoring the watermarks (Such as events to signal buffer level) so we can wake up sink based on it. SinkExt::flush() also doesn't work because of this.

There is unfortunately no elegant way to do it, but a polling-based approach (as outlined by e.g. the Chrome developers here) using setTimeout is what people use, and it does work quite reliably. I greatly resent polling myself, but it appears to be the only possible solution here. Plus, objectively speaking the work of regularly polling bufferedAmount is certainly negligible compared to that the work the machine is already doing in order to process and send the actual messages.

For a general-purpose offering like gloo there are some difficult questions regarding huge messages, so that's why I'm not sure if it's a fit - but it's definitely worth discussing.

I don't think JavaScript's WebSocket implementation will drop messages just because it is back logged as well? I think the buffer in WebSocket is unbound and can grow indefinitely. (Until an unknown limit set by the browser is reached.)

You are correct: the spec does not provide a backpressure mechanism, so the limit is implementation-dependent. Quoting MDN:

If the data can't be sent (for example, because it needs to be buffered but the buffer is full), the socket is closed automatically.
[...]
If you call send() when the connection is in the CLOSING or CLOSED states, the browser will silently discard the data.

Exceeding the send buffer is punished by killing the connection and silently dropping all subsequently sent data on the floor.

@hamza1311
Copy link
Collaborator

A way to take care of this would be to have into_raw and from_raw methods, similar to Box. I'm not sure if I want to make the latter unsafe (as that's not what unsafe is for). A note in the documentation suggesting that the raw instance must be obtained via into_raw should suffice.

I would still prefer to add whatever methods are needed to WebSocket to provide the data that the underlying instance provides. If anything other than bufferedAmount, I'd be happy to add that as well

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants