Goroutine leaks when using datachannel with a multi mux with single port #2738

Open
tyohan opened this issue Apr 8, 2024 · 3 comments
Labels
bug Something isn't working difficulty:hard

Comments

@tyohan

tyohan commented Apr 8, 2024

Your environment.

  • Version: bd25613
  • Browser: Reproduced with a test

What did you do?

I saw goroutine leaks in my SFU when using a data channel with a multi mux on a single port. I was able to reproduce this with a test in my fork. The test fails because of the goroutine check that runs after the test.

What did you expect?

The test that I added to reproduce this issue should pass.

The test does not close the single-port muxer. This is intentional: when running an SFU server with a single-port muxer, the ice.NewMultiUDPMuxFromPort mux listener stays open until the SFU is shut down. The test passes when the mux is closed, but not when the mux is kept open.

When I check with pprof, the blocked goroutines are listed like this:

3 @ 0x43e54e 0x46dc19 0x46dbf9 0x47ae45 0x873516 0x878434 0x935aba 0x471a01
#	0x46dbf8	sync.runtime_notifyListWait+0x138				/usr/local/go/src/runtime/sema.go:527
#	0x47ae44	sync.(*Cond).Wait+0x84						/usr/local/go/src/sync/cond.go:70
#	0x873515	github.com/pion/sctp.(*Stream).ReadSCTP+0xd5			/go/pkg/mod/github.com/pion/sctp@v1.8.13/stream.go:146
#	0x878433	github.com/pion/datachannel.(*DataChannel).ReadDataChannel+0x53	/go/pkg/mod/github.com/pion/datachannel@v1.5.5/datachannel.go:193
#	0x935ab9	github.com/pion/webrtc/v3.(*DataChannel).readLoop+0xb9		/go/pkg/mod/github.com/pion/webrtc/v3@v3.2.32/datachannel.go:361

2 @ 0x43e54e 0x44e985 0x815c6f 0x9239fc 0x88cd0f 0x471a01
#	0x815c6e	github.com/pion/transport/v2/packetio.(*Buffer).Read+0x1ae	/go/pkg/mod/github.com/pion/transport/v2@v2.2.4/packetio/buffer.go:267
#	0x9239fb	github.com/pion/webrtc/v3/internal/mux.(*Endpoint).Read+0x1b	/go/pkg/mod/github.com/pion/webrtc/v3@v3.2.32/internal/mux/endpoint.go:40
#	0x88cd0e	github.com/pion/srtp/v2.(*session).start.func1+0xae		/go/pkg/mod/github.com/pion/srtp/v2@v2.0.18/session.go:144

Based on this, I traced the issue to sync.(*Cond).Wait(), which never returns even after the peer connection is closed. I assume the mux endpoint never receives the close event because the mux itself is never closed when using a single-port muxer. I'm happy to fix this bug, since it would help me learn the codebase and contribute more to the Pion project, but it would be helpful if someone could point me to where I should be looking.

Thanks

@cnderrauber
Member

The test failed because udpmux starts a goroutine to listen on the UDP port, and that mux is not closed in the test, so the test is expected to fail. After running your test code locally, I can't find any data-channel-related goroutines in the test report.

@cnderrauber
Member

3 @ 0x43e54e 0x46dc19 0x46dbf9 0x47ae45 0x873516 0x878434 0x935aba 0x471a01
#	0x46dbf8	sync.runtime_notifyListWait+0x138				/usr/local/go/src/runtime/sema.go:527
#	0x47ae44	sync.(*Cond).Wait+0x84						/usr/local/go/src/sync/cond.go:70
#	0x873515	github.com/pion/sctp.(*Stream).ReadSCTP+0xd5			/go/pkg/mod/github.com/pion/sctp@v1.8.13/stream.go:146
#	0x878433	github.com/pion/datachannel.(*DataChannel).ReadDataChannel+0x53	/go/pkg/mod/github.com/pion/datachannel@v1.5.5/datachannel.go:193
#	0x935ab9	github.com/pion/webrtc/v3.(*DataChannel).readLoop+0xb9		/go/pkg/mod/github.com/pion/webrtc/v3@v3.2.32/datachannel.go:361

2 @ 0x43e54e 0x44e985 0x815c6f 0x9239fc 0x88cd0f 0x471a01
#	0x815c6e	github.com/pion/transport/v2/packetio.(*Buffer).Read+0x1ae	/go/pkg/mod/github.com/pion/transport/v2@v2.2.4/packetio/buffer.go:267
#	0x9239fb	github.com/pion/webrtc/v3/internal/mux.(*Endpoint).Read+0x1b	/go/pkg/mod/github.com/pion/webrtc/v3@v3.2.32/internal/mux/endpoint.go:40
#	0x88cd0e	github.com/pion/srtp/v2.(*session).start.func1+0xae		/go/pkg/mod/github.com/pion/srtp/v2@v2.0.18/session.go:144

It looks like a PeerConnection leak in your code that keeps the SRTP session open.

@tyohan
Author

tyohan commented Apr 11, 2024

@cnderrauber thank you for trying my code. This issue might not be directly related to the data channel; it has more to do with the single-port muxer. The SRTP session stays open because the buffer read function is also stuck waiting for either a new packet or the connection to close, and the connection is never closed with a single-port muxer. I'll dig further and see whether this is on my end rather than in Pion. I'll keep this issue updated.

@Sean-Der Sean-Der added bug Something isn't working difficulty:hard labels Apr 16, 2024
3 participants