Datachannels have a great latency for poor network #206

Open
leeliliang opened this issue Oct 11, 2021 · 9 comments

@leeliliang

leeliliang commented Oct 11, 2021

What did you do?

Hi,
I tested signaling delay on a poor network, measured as the signaling round-trip time. DataChannel performs poorly compared to WebSocket and KCP.
rtt = recv(timestamp) - send(timestamp)
I use Pion as a DataChannel server, and I also tested P2P communication with a web client; the results were also bad.
I'm confused: does the SCTP protocol perform poorly on a poor network?

result: [three attached images, labeled 1, 2 and 3, not preserved]

What did you expect?

What happened?

leeliliang changed the title from "datachannel has a great latency for Poor Network" to "datachannel has a great latency for poor network" on Oct 11, 2021
@Sean-Der
Member

Sean-Der commented Nov 9, 2021

Hey @leeliliang

I would check out w3c/webrtc-extensions#71

I haven't looked into this issue myself, but other members of the Pion community are interested in it. @daonb is interested in this, and @enobufs knows this area deeply as well.

@enobufs
Member

enobufs commented Nov 9, 2021

Interesting. I thought it was pretty comparable with TCP. Let me take a look at this. @leeliliang could you tell me the data size of the heartbeat, and also the delay in the path in your setup? (Is that impairment simulated?) I will try to reproduce it on my end.

It looks like max-RTO may need to be optimized. I am curious to know how aggressive KCP's retransmission is. SCTP's backoff is congestion control: it tries to recover from congestion (signaled by packet loss) by slowing down, so it is not a bad thing in itself. Given the large difference, however, Pion SCTP might be too nice to the network, or there could be a bug to fix.

@leeliliang
Author

@Sean-Der
Thank you very much for your reply. I have been confused about this for a long time.
Very grateful!

@enobufs
The data size is 593 bytes. I also tested other data sizes: the larger the amount of data, the worse the performance.
The receiver replies as soon as it gets a message, and I did not set any delay in the path.
SCTP does not have Tail Loss Probe (TLP), so when the last message sent is lost, it can only be retransmitted after the RTO expires. The backoff algorithm makes that interval too long, which makes things worse.
So is there any way to make it better? Or does the SCTP protocol need to be optimized with some new mechanism for poor networks? The performance of DataChannel in the browser is also not very good.

KCP was run in speed mode; its normal mode is also much better than DataChannel on a poor network.

Recently I compared the weak-network performance of DataChannel, KCP, QUIC, and WebSocket for signaling, in order to make communication more reliable and lower latency.
I could not find DataChannel used for this in production, and I think there is a reason for that.

Finally, I really appreciate your contributions on this.

@daonb
Contributor

daonb commented Nov 11, 2021

Hi @leeliliang, and thanks for the detailed info. I use data channels to stream a pty, and I feel those spikes when I hit a key and, in extreme cases, see it printed a minute later.

It happened quite a lot during the lockdown when my two daughters were zooming on my meager 2MB uplink copper wire. I did some research and I think SCTP can be improved. IMO, the exponential backoff doesn't belong in the protocol.

The exponential backoff comes from TCP in the early 80s, when it was used by FTP to transfer most of the data. The routers of the 80s were PDP-11s with little memory and overflowing stacks. The exponential backoff was a hack to help them live longer and survive the long nights of file transfers.

Luckily we don't need to worry about routers anymore. We do have to worry about being realtime especially in a protocol designed to control. I've asked to register with our overlords at the IETF so I can start a discussion in their forum but got nothing.

Maybe I should start a branch nobackoff?

@enobufs
Member

enobufs commented Nov 12, 2021

(EDITED)
Let me take back what I mentioned here. RFC 5827 is just one of the related works. I am going over other discussions and proposals related to the issue (including RFC 6675). I will summarize my findings here later.

@enobufs
Member

enobufs commented Nov 14, 2021

There are many experimental and proposed IETF drafts out there on improving loss recovery, including the tail loss issue.
Here are some of the documents I went over (in random order):

  • SCTP Tail Loss Recovery - Tsvwg, IETF 90, Toronto (pdf)
  • SCTP Tail Loss Recovery Enhancements, SCTP TLR (pdf slide)
  • draft-dukkipati-tcpm-tcp-loss-probe-01
  • Tail Loss Probe (romain-jacotin/quic)
  • An Evaluation of Tail Loss Recovery Mechanisms for TCP (pdf)
  • Faster Small Downloads: TCP Early Retransmit and Tail Loss Probe
  • SCTP as an Universal Multiplexing Layer (pdf)
  • RFC 5827: Early Retransmit for TCP and Stream Control Transmission Protocol (SCTP)
  • RFC 8985: The RACK-TLP Loss Detection Algorithm for TCP
  • RACK for SCTP (pdf - slide, IEEE paper)

Links in the above list were gone.. :( If you are interested, take a look at my note (gdoc).

It was really not obvious what the best way to resolve the tail loss issue would be; however, the last two items in the above list appeared to be the most active.

Dr. Michael Tuexen (chair of IETF tcpm WG, author of usrsctp, co-author of "RACK for SCTP") kindly replied to my questions and told me he would just go with "RACK for SCTP", an SCTP implementation of RFC 8985. I am now reading the draft. It is a combination of time-based loss detection and Google's TLP (Tail Loss Probe). Luckily, it is a sender-only solution (no need to worry about interoperability issues), and it is a lot easier to implement in SCTP than in TCP because SCTP already has many of the features required by RACK.

@leeliliang
Author

leeliliang commented Nov 15, 2021

> Maybe I should start a branch nobackoff?

@daonb
I think it is not necessary, but I am not sure; I need a deeper understanding. I think it would alleviate the problem but not solve it.
In my opinion, the SCTP protocol needs to be optimized with some new mechanism for poor networks.
Thanks for your efforts on this problem.

@leeliliang
Author


Yep! @enobufs
I am glad to learn about so many new mechanisms.
Looking forward to better performance!

@daonb
Contributor

daonb commented Nov 18, 2021

> In my opinion, the SCTP protocol needs to be optimized with some new mechanism for poor networks.

I agree. Maybe we can help. The most important thing for such a mechanism is a good performance test, and you have started one. If you publish your code, I promise to help make it into something we can use to test different mechanisms, and to help the next draft author try his mechanism and prove his case.

stv0g changed the title from "datachannel has a great latency for poor network" to "Datachannels have a great latency for poor network" on Feb 24, 2023
enobufs self-assigned this on Jan 1, 2024