
webrtc: wait for FIN_ACK before closing data channels #2615

Merged: 34 commits merged into master from sukun/webrtc-fin-ack, Feb 21, 2024

Conversation

@sukunrt (Member) commented Oct 22, 2023

part of #2656

This removes the control message queue, since a control message only ever adds about 500 bytes to the queue. This should not cause much head-of-line blocking for other streams.

Backpressure is not a concern: SCTP writes never actually block, so there is no blocking on the SCTP Write path.

To verify:

func TestStreamBackpressure(t *testing.T) {
	client, _ := getDetachedDataChannels(t)

	clientStr := newStream(client.dc, client.rwc, nil)

	b := make([]byte, 1024)
	for i := 0; i < 1e7; i++ {
		n, err := clientStr.dataChannel.Write(b)
		if err != nil {
			t.Fatalf("failed to write to datachannel: %s", err)
		}
		if n != len(b) {
			t.Fatalf("short write to datachannel: %d of %d bytes", n, len(b))
		}
	}
}

@sukunrt changed the base branch from master to webrtcprivate/integration-tests on October 22, 2023
@sukunrt force-pushed the webrtcprivate/integration-tests branch from 06d4eea to fd471fa on October 23, 2023
@sukunrt force-pushed the sukun/webrtc-fin-ack branch 3 times, most recently from 6e4d8c2 to ca1cf7d on October 23, 2023
@sukunrt marked this pull request as ready for review on October 23, 2023
@sukunrt changed the base branch from webrtcprivate/integration-tests to master on December 1, 2023
@sukunrt force-pushed the sukun/webrtc-fin-ack branch 18 times, most recently from 0fe1285 to 1036003 on December 5, 2023
@marten-seemann (Contributor) left a comment

Partial review.

Discussed offline: if we move closing of the stream scope to the transports, we don't need to introduce the AsyncCloser interface. We need to decide whether that structure makes sense first.

c.closeErr = errors.New("connection closed")
c.cancel()
return c.pc.Close()
func (c *connection) closeTimedOut() error {
@marten-seemann (Contributor) commented on closeTimedOut:
Add a comment?

Resolved review threads: p2p/transport/webrtc/connection.go (outdated, ×2), p2p/test/transport/transport_test.go, p2p/transport/webrtc/stream.go, p2p/transport/webrtc/stream.go (outdated)
@sukunrt (Member, Author) commented Feb 13, 2024

@Stebalien I have removed AsyncClose.

The difference between webrtc and yamux/quic is that with those transports you rely on the transport stack to discard all the buffers when you do a Close. This happens with yamux. With quic, the receive side of the stream is not immediately discarded, but the peer doesn't get credit to open a new stream until it cleanly finishes the stream after we have sent STOP_SENDING.

No such mechanism exists in webrtc. There is no limit on the number of streams you can open (at least not in pion), and no unilateral stream closing, since streams are reusable. In webrtc, datachannel.Close is a 1-RTT synchronous procedure on top of the FIN / FIN_ACK closing that we do. So an effective AsyncClose procedure needs to run after the data channel is closed, not after a FIN_ACK is received. Moreover, pion never drops its reference to the data channel (including all the transport-layer allocated buffers) while the connection is open. So a system like this only makes sense if we fix pion/webrtc#2672.

There is one other way to handle this, apart from doing an explicit async close without waiting for 1 RTT: detect a misbehaving peer that is forcing us to keep all the data channels open, and close the connection to such a peer. We can discuss that strategy in a separate issue. That would also require pion/webrtc#2672.

@Stebalien (Member) commented:

> With quic, the receive side of the stream is not immediately discarded but the peer doesn't get credits to open a new stream till it cleanly finishes the stream after we have sent STOP_SENDING on.

Can't we do the same thing here? My objection is adding this hack to libp2p itself as this is a transport specific detail.

@sukunrt (Member, Author) commented Feb 13, 2024

> My objection is adding this hack to libp2p itself as this is a transport specific detail.

That's a fair point. I'll open an issue to discuss.

@sukunrt force-pushed the sukun/webrtc-fin-ack branch 3 times, most recently from 43c7f77 to e60515d on February 14, 2024
@sukunrt (Member, Author) commented Feb 19, 2024

@Stebalien

> Why not just call closeAndRemoveStream immediately?

I have addressed this and removed the AsyncClose logic. Is it fine to merge?

@Stebalien (Member) left a comment

Some comments, mostly questions, otherwise LGTM.

@@ -35,18 +37,28 @@ func (s *stream) Read(b []byte) (int, error) {
var msg pb.Message
if err := s.reader.ReadMsg(&msg); err != nil {
s.mx.Lock()
@Stebalien (Member) commented:

nit: not related, but this locking is strange. We could write this as:

s.mx.Unlock()
var msg pb.Message
err := s.reader.ReadMsg(&msg)
s.mx.Lock()
if err != nil {
    // ...
}
// no second "lock" call needed.

@sukunrt (Member, Author) replied:

This is much better. Thanks!

Resolved review threads: p2p/transport/webrtc/stream.go (outdated, ×2)
defer s.mx.Unlock()

s.closeForShutdownErr = closeErr
s.SetReadDeadline(time.Now().Add(-1 * time.Hour))
@Stebalien (Member) commented:

This seems like a strange way to do this. It'll also be racy (e.g., if the user calls the same function to change the read deadline).

Do we need to change the read deadline? If this is the only way to do this, let's add something to SetReadDeadline so it can't be changed again by the user.

@sukunrt (Member, Author) replied:

This is the only way to unblock a waiting reader. Even closing the data channel won't unblock it until 1 RTT later, when the peer replies with a close of the data channel on its side.

I've changed this to make SetReadDeadline a no-op after the read half is closed, so the user can't change it again.

Comment on lines 223 to 225
select {
case s.sendStateChanged <- struct{}{}:
default:
@Stebalien (Member) commented:

Future cleanup: make some form of notifyStateChanged function.

@sukunrt (Member, Author) replied:

I've replaced the three channels with a single writeStateChanged channel, and I'm using a method to notify on that channel, removing all the select blocks everywhere.

s.mx.Lock()
defer s.mx.Unlock()
// Unblock any Read call waiting on reader.ReadMsg
s.SetReadDeadline(time.Now().Add(-1 * time.Hour))
@Stebalien (Member) commented:

same race comment as above.

Resolved review thread: p2p/transport/webrtc/stream_write.go (outdated)
@@ -17,6 +18,7 @@ import (
quicproxy "github.com/quic-go/quic-go/integrationtests/tools/proxy"

"golang.org/x/crypto/sha3"
"golang.org/x/exp/rand"
@Stebalien (Member) commented:

Is this the rand you want to use?

@sukunrt (Member, Author) replied Feb 20, 2024:

I did it mostly because of this: golang/go#21835

But it looks like math/rand.Read is deprecated, and the recommendation is to always use crypto/rand.Read. So I'm using that now.

@sukunrt dismissed stale reviews from Stebalien and Jorropo on February 21, 2024:

I have addressed your concerns here. I want to do a release today. Happy to take any comments in a follow up PR and if something major is missed we can do a patch release.

@sukunrt merged commit f487b81 into master on Feb 21, 2024
11 checks passed
@Stebalien (Member) commented:

> I have addressed your concerns here. I want to do a release today. Happy to take any comments in a follow up PR and if something major is missed we can do a patch release.

Nope, LGTM!

6 participants