LL-DASH unusable in production #268

Closed
basisbit opened this issue Feb 1, 2021 · 15 comments
Labels
bug Confirmed as bug

Comments

@basisbit
Contributor

basisbit commented Feb 1, 2021

Describe the bug
LL-DASH streaming is currently broken in master. It works only in Chromium-based web browsers, and only as long as no frame and no network packet is dropped.
In typical production setups, LL-DASH often stops playing and the web player just spins for some time until it eventually tries to restart the stream. Sometimes it tries to load segments that return 404, although the next segment with a higher number is returned successfully when requested manually via HTTP GET. LL-DASH playback in Firefox fails on most attempts.

To Reproduce
Steps to reproduce the behavior:

  1. Create the minimal setup: Set Server.xml of Origin to this and Server.xml of Edge to this
  2. Use current OBS to stream to the Origin server
  3. Try to play the LL-DASH stream using this html file in Chrome, Firefox, Safari
  4. See that it only plays (sometimes) in Chrome, but not in Firefox. Also it is very unstable and typically unusable in Chrome, as soon as the OBS streamer doesn't

Expected behavior
The LL-DASH stream should play stably.

Logs
Nothing to see from Firefox's failed attempts, but when it fails for Chrome, usually you see something like this in the log:

edge_1    | [2021-02-01 04:09:13.410] I [SegWorker:26] HTTPPublisher | segment_publisher.cpp:188  | Segment requested (#default#app/stream/init_audio_ll.m4s) from 84.118.161.138:63208 : Segment number : 0 Duration : 0
edge_1    | [2021-02-01 04:09:13.410] I [SegWorker:26] Monitor | stream_metrics.cpp:119  | A new session has started playing #default#app/stream on the LLDASH publisher. LLDASH(4)/Stream total(4)/App total(4)
edge_1    | [2021-02-01 04:09:13.534] I [SegWorker:23] HTTPPublisher | segment_publisher.cpp:188  | Segment requested (#default#app/stream/33_video_ll.m4s) from 84.118.161.138:63210 : Segment number : 68 Duration : 4
edge_1    | [2021-02-01 04:09:13.534] I [SegWorker:23] Monitor | stream_metrics.cpp:119  | A new session has started playing #default#app/stream on the LLDASH publisher. LLDASH(5)/Stream total(5)/App total(5)
edge_1    | [2021-02-01 04:09:13.535] I [SegWorker:24] HTTPPublisher | segment_publisher.cpp:188  | Segment requested (#default#app/stream/33_audio_ll.m4s) from 84.118.161.138:63209 : Segment number : 67 Duration : 4
edge_1    | [2021-02-01 04:09:13.535] I [SegWorker:24] Monitor | stream_metrics.cpp:119  | A new session has started playing #default#app/stream on the LLDASH publisher. LLDASH(6)/Stream total(6)/App total(6)
edge_1    | [2021-02-01 04:09:15.009] I [SegPubReq:27] Monitor | stream_metrics.cpp:144  | A session has been stopped playing #default#app/stream on the LLDASH publisher. Concurrent Viewers[LLDASH(5)/Stream total(5)/App total(5)]
edge_1    | [2021-02-01 04:09:15.010] I [SegPubReq:27] Monitor | stream_metrics.cpp:144  | A session has been stopped playing #default#app/stream on the LLDASH publisher. Concurrent Viewers[LLDASH(4)/Stream total(4)/App total(4)]
edge_1    | [2021-02-01 04:09:15.010] I [SegPubReq:27] Monitor | stream_metrics.cpp:144  | A session has been stopped playing #default#app/stream on the LLDASH publisher. Concurrent Viewers[LLDASH(3)/Stream total(3)/App total(3)]
edge_1    | [2021-02-01 04:09:15.010] I [SegPubReq:27] Monitor | stream_metrics.cpp:144  | A session has been stopped playing #default#app/stream on the LLDASH publisher. Concurrent Viewers[LLDASH(2)/Stream total(2)/App total(2)]
edge_1    | [2021-02-01 04:09:15.946] W [AppWorker:39] ov.Queue | queue.h:192  | [0x7f104405b688] ClientSocket #8 123.123.123.123:63197 (of #4) size has exceeded the threshold: queue: 304, threshold: 100, peak: 304
origin_1  | [2021-02-01 04:09:20.484] C [StreamWorker:598] OVT | ovt_session.cpp:105  | time : 8
edge_1    | [2021-02-01 04:09:20.895] I [SegWorker:24] HTTPPublisher | segment_publisher.cpp:188  | Segment requested (#default#app/stream/35_video_ll.m4s) from 84.1
18.161.138:63220 : Segment number : 72 Duration : 4
edge_1    | [2021-02-01 04:09:20.895] I [SegWorker:24] Monitor | stream_metrics.cpp:119  | A new session has started playing #default#app/stream on the LLDASH publisher. LLDASH(3)/Stream total(3)/App total(3)
origin_1  | [2021-02-01 04:09:21.559] C [StreamWorker:598] OVT | ovt_session.cpp:105  | time : 4
origin_1  | [2021-02-01 04:09:24.254] C [StreamWorker:598] OVT | ovt_session.cpp:105  | time : 10
edge_1    | [2021-02-01 04:09:28.938] E [PhyPortSerSock:12] Socket | server_socket.cpp:207  | [0x562497cb68e0] [#4] [Epoll] EPOLLIN | EPOLLERR | EPOLLHUP | EPOLLRDHUP
edge_1    | [2021-02-01 04:09:29.149] I [SegWorker:24] HTTPPublisher | segment_publisher.cpp:188  | Segment requested (#default#app/stream/37_audio_ll.m4s) from 123.123.123.123:63228 : Segment number : 75 Duration : 5
edge_1    | [2021-02-01 04:09:29.149] I [SegWorker:24] Monitor | stream_metrics.cpp:119  | A new session has started playing #default#app/stream on the LLDASH publisher. LLDASH(4)/Stream total(4)/App total(4)
edge_1    | [2021-02-01 04:09:30.921] I [SegWorker:25] HTTPPublisher | segment_publisher.cpp:188  | Segment requested (#default#app/stream/37_video_ll.m4s) from 123.123.123.123:63229 : Segment number : 76 Duration : 4
edge_1    | [2021-02-01 04:09:36.010] I [SegPubReq:27] Monitor | stream_metrics.cpp:144  | A session has been stopped playing #default#app/stream on the LLDASH publisher. Concurrent Viewers[LLDASH(3)/Stream total(3)/App total(3)]

Server (please complete the following information):

  • OS: Ubuntu 20.04 + current Docker from official docker repository. Dedicated server with dedicated 1Gb/s network uplink in a decent datacenter where we also successfully host hundreds of servers for webrtc-video-conferencing.
  • OvenMediaEngine Version: today's master
  • Branch: master

Player (please complete the following information):

  • Device: Chrome on Android, Firefox on Android (LG G6), Chrome 90 on Windows 10, Firefox 85 on Windows 10 (AMD 4750U), Safari in VMWare virtual machine

Additional context
Related to #182, #248, #162.

Increasing lowLatencyMpdLiveDelay did not fix the issue. I only found out that playback fails with higher values because OME has eventually already deleted (or never created?) some older chunk files, and it returns 404 stream not found for HTTP GET requests for some of those chunk files.

Fails in Firefox with OvenPlayer as well as official dash.js reference player 2.9.3 and 3.0.1 and 3.0.3 (others not tested).
Rarely, Firefox manages to play the stream after waiting for it for minutes, but then usually only plays for a few seconds before it loading-spins to oblivion again.

If no developer wants to put the time and effort into fixing LL-DASH for most typical use-cases, please change the documentation so users know that they should probably not spend any time on trying to get LL-DASH to work with OvenMediaEngine as server.

@getroot
Sponsor Member

getroot commented Feb 1, 2021

We are constantly improving LLDASH so that it operates well in various environments and conditions. (From our findings, the current LLDASH seems to work well only in a small range of environments and conditions.) Of course, it is being delayed by higher-priority tasks, but I believe all problems will be resolved. We recently resumed work on stabilizing LLDASH.

@dimiden
Member

dimiden commented Feb 3, 2021

@basisbit
TL;DR
LL-DASH does not work normally if the time on the server where your OME is installed differs from that of the Akamai time server. Please check whether the server time is the same as Akamai's.

Details
I assume that your server time is probably different from the Akamai server time. This is one of the problems that people who use LL-DASH often experience.
Since LL-DASH calculates how much time has elapsed since the availabilityStartTime value of the manifest to obtain the index of the segment to play, time synchronization between server and client is critical.

For example, suppose segment_duration is set to 2 seconds and a stream is created at 00:00:00.
If the current time of the server is 00:00:10, the server will have completed a total of five segments: 0.m4s, 1.m4s, 2.m4s, 3.m4s, and 4.m4s.
At this point, if a client whose clock is 1 second ahead of the server plays this stream, it will request 5.m4s from the server, and the server will return 404 Not Found, which will cause playback to fail.
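
The effect of clock skew on the requested segment index can be sketched as follows. This is a simplified model and not the actual dash.js logic: it assumes a segment is requestable only once it is fully produced, the function name `newestCompleteSegment` is hypothetical, and the server clock is set to 00:00:11 (rather than 00:00:10) so that a one-second skew crosses a segment boundary in this model.

```javascript
// Simplified model (an assumption, not dash.js's real algorithm):
// a segment can be requested only after it has been fully produced.
function newestCompleteSegment(nowMs, availabilityStartMs, segmentDurationSec) {
    const elapsedSec = (nowMs - availabilityStartMs) / 1000;
    // floor(elapsed / duration) segments are complete; the newest has index count - 1.
    return Math.floor(elapsedSec / segmentDurationSec) - 1;
}

const streamStart = Date.UTC(2021, 1, 1, 0, 0, 0); // stream created at 00:00:00
const serverNow = streamStart + 11000;             // server clock: 00:00:11
const clientNow = serverNow + 1000;                // client clock runs 1 s ahead

const onServer = newestCompleteSegment(serverNow, streamStart, 2);
const requested = newestCompleteSegment(clientNow, streamStart, 2);

// The client asks for a segment the server has not finished yet.
console.log(`server has up to ${onServer}.m4s, client requests ${requested}.m4s`);
```

In this model the server has segments up to 4.m4s while the client computes 5.m4s, which is the 404 case described above.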

DASH has server-client time synchronization mechanisms to address this issue.
One of these is to put <UTCTiming> in the manifest, and the other is to get the time from an external time server, such as the Akamai time server mentioned above.
Currently, OME does not use the <UTCTiming> method because dash.js appears not to parse the millisecond part of <UTCTiming> in the XML properly. This causes the player to call http://time.akamai.com/?iso&ms, an external time server that is hard-coded in dash.js.
For this reason, the time on the server where OME is running must be the same as the time on the Akamai time server.

If your server time cannot be set to the same as Akamai's, it is recommended that you use the following code to run your own time server and then set UTCTimingSources in OvenPlayer.

<?php
// Place this PHP script on the server where OME is running.
// It outputs the current UTC time in ISO 8601 format, including milliseconds.

$timestamp_in_ms = (int)(microtime(true) * 1000);

$milliseconds = ($timestamp_in_ms % 1000);
$padded = str_pad((string)$milliseconds, 3, '0', STR_PAD_LEFT);

$timestamp = (int)($timestamp_in_ms / 1000);
$datetime = gmdate('Y-m-d\TH:i:s', $timestamp);

// "{$var}" replaces the "${var}" form, which is deprecated as of PHP 8.2.
echo "{$datetime}.{$padded}Z";
?>
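
For servers without PHP, the same kind of endpoint could be sketched in Node.js. This is only an illustration under assumptions: the handler name `handleTimeRequest`, the port, and the CORS and cache headers are not from this thread. It relies on `Date.prototype.toISOString()`, which returns UTC time with milliseconds in the same format as the PHP script.

```javascript
// Hypothetical request handler: answers every request with the current
// UTC time in ISO 8601 format with milliseconds (YYYY-MM-DDTHH:mm:ss.sssZ).
function handleTimeRequest(req, res) {
    const body = new Date().toISOString();
    res.writeHead(200, {
        'Content-Type': 'text/plain',
        // Assumption: the player page is served from a different origin.
        'Access-Control-Allow-Origin': '*',
        // Assumption: a cached timestamp would defeat the purpose.
        'Cache-Control': 'no-store',
    });
    res.end(body);
}

// Usage (port 8090 is an arbitrary choice):
// require('http').createServer(handleTimeRequest).listen(8090);
```

The URL of this endpoint would then be the value passed as the time server URL in the player setup.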

// Please initialize your OvenPlayer as follows:

let player = OvenPlayer.create(...);

player.on('dashPrepared', function (dash) {
    dash.clearDefaultUTCTimingSources();
    dash.addUTCTimingSource('urn:mpeg:dash:utc:http-xsdate:2014', '<your time server URL>');
});

If the LL-DASH still does not play even though the time is right, please reply again.

@basisbit
Contributor Author

basisbit commented Feb 3, 2021

@dimiden afaik, the akamai time-sync mechanism should never work when dash.js is embedded within an https website, because http://time.akamai.com is hardcoded in every dash.js release before 3.2.1 (to be released next week) and mixed-content access from JS is forbidden in current versions of Firefox.

There is no php-fpm on that server, however I'll try the time server workaround.

@basisbit
Contributor Author

basisbit commented Feb 3, 2021

The server-local time service and akamai both return approximately the same UTC time, so the time difference is not the reason. However, adding the local time source is required for anyone from the EU, California, or Brazil to be compliant with data protection laws. So, thank you for pointing out how to set a custom time source!

However, Firefox still does not play the LL-DASH stream.

@dimiden
Member

dimiden commented Feb 4, 2021

@basisbit
Thank you for your additional information.

To explain the hard-coded URL a little: even though the URL set in UTCTimingSource is HTTP, if the URL of the manifest is HTTPS at the time of actual use, dash.js replaces HTTP with HTTPS here. Therefore, adding http://time.akamai.com to UTCTimingSource does not cause any problems when using HTTPS.
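
The idea of that scheme replacement can be sketched as follows. This is not the dash.js implementation; `matchManifestScheme` is a hypothetical helper that only illustrates upgrading an http:// time source when the manifest was loaded over https://.

```javascript
// Hypothetical helper, not dash.js code: if the manifest was loaded over
// HTTPS and the time source is plain HTTP, upgrade the time source to HTTPS.
function matchManifestScheme(timingUrl, manifestUrl) {
    const timing = new URL(timingUrl);
    const manifest = new URL(manifestUrl);
    if (manifest.protocol === 'https:' && timing.protocol === 'http:') {
        timing.protocol = 'https:';
    }
    return timing.href;
}

console.log(matchManifestScheme(
    'http://time.akamai.com/?iso&ms',
    'https://domain:8081/app/stream/manifest_ll.mpd'
));
```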

As for LL-DASH not playing in Firefox: the sample URL provided by dash.js does not play either.
I think it's a bug in dash.js, and I'll see if there's a workaround.

image
image

@basisbit
Contributor Author

basisbit commented Feb 5, 2021

@dimiden Thank you for the additional information! The sample video plays fine with the reference client in version 3.1.0 or 2.9.3 in Firefox 85. OvenPlayer also plays LL-DASH streams fine in the same browser as long as they don't come from OME. You can find a sample at https://jenkins.basisbit.de:943/sample.html

@dimiden
Member

dimiden commented Mar 7, 2021

@basisbit
I missed your comment. There are URLs that play on Firefox + dash.js 3.2 and those that do not.
First of all, the URL that I said above would not play is [Akamai] Akamai Low Latency Stream (Single Rate). Just like the stream generated by OME, its live latency continues to increase and it does not play.

The URL of the sample page you provided is a stream created by Live Simulator - [DASH-IF] Low Latency (Single-Rate) (livesim-chunked) -, which plays well, so I'm trying to find the reason why OME's stream is not playing by comparing the differences between these two.

The biggest problem is, I don't have time to analyze the problem right now. :)
When the task I'm currently working on is completed, I'm going to deal with this issue with the next task, so I'll let you know again when it's done!

@sephentos

Hi,

After getting LL-DASH working perfectly in Chrome, I did not manage to get it working in Firefox.
I'm using the latest dash.js (v3.2.1) (also tried with various other versions), and my client and server have exactly the same timestamp as the one given by time.akamai.com.
Using latest OvenPlayer v.0.9.0-2021030518-rev.57fec71.

Firefox:

image

Chrome:

image

Same results with OvenPlayer and demo.ovenplayer.com

OvenPlayer is hosted on an HTTPS site, and OME has also been configured with HTTPS.

Also note that Firefox generates many 404s, for example https://domain:8081/app/stream/306_audio_ll.m4s, https://domain:8081/app/stream/262_audio_ll.m4s, etc.
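
To check which segment numbers the server actually serves, a small probe along these lines might help. It is a sketch under assumptions: `segmentUrls` and `probeSegments` are hypothetical helpers, the segment range is a placeholder, and a runtime with a global `fetch` (modern browsers, Node.js 18+) is assumed. The URL pattern follows the `<number>_<track>_ll.m4s` naming seen in the requests above.

```javascript
// Builds segment URLs following the <number>_<track>_ll.m4s naming scheme.
// baseUrl, track, first and count are supplied by the caller.
function segmentUrls(baseUrl, track, first, count) {
    const urls = [];
    for (let n = first; n < first + count; n++) {
        urls.push(`${baseUrl}/${n}_${track}_ll.m4s`);
    }
    return urls;
}

// Requests each URL in turn and records the HTTP status code.
// Assumes a global fetch is available.
async function probeSegments(urls) {
    const results = [];
    for (const url of urls) {
        const response = await fetch(url);
        results.push({ url, status: response.status });
    }
    return results;
}

// Usage (the range 300..309 is a placeholder):
// probeSegments(segmentUrls('https://domain:8081/app/stream', 'audio', 300, 10))
//     .then(console.table);
```

Comparing the statuses with the numbers the player requests shows whether the player is asking for segments ahead of or behind what the server holds.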

image

Chrome (working):

image

OvenPlayer JS:
        {
            file: 'https://domain:8081/app/stream/manifest_ll.mpd',
            type: 'dash',
            lowLatency: true,
            default: true
        }
...
Server.xml:
                        <DASH>
                                <Port>8080</Port>
                                <TLSPort>8081</TLSPort>
                                <WorkerCount>4</WorkerCount>
                        </DASH>
...
                        <Audio>
                                <Codec>aac</Codec>
                                <Bitrate>128000</Bitrate>
                                <Samplerate>48000</Samplerate>
                                <Channel>2</Channel>
                        </Audio>
                        <Video>
                                <Codec>h264</Codec>
                                <Bitrate>2000000</Bitrate>
                                <Framerate>30</Framerate>
                                <Width>1920</Width>
                                <Height>1080</Height>
                        </Video>

OS: Debian 10 (Linux x86_64 - 4.19.0-16-amd64, SMP Debian 4.19.181-1 (2021-03-19))
OME: v0.11.2
OvenPlayer: v.0.9.0-2021030518-rev.57fec71
Dash.js: v3.2.1 (Also tried with various other versions)

Server is not within some kind of VPN or behind a special firewall.
All required ports are open.
Again, keep in mind that it's working perfectly in chrome.

OME log:

[2021-03-29 21:24:45.544] I [SegWorker:2079] Monitor | stream_metrics.cpp:119  | A new session has started playing #default#app/stream on the LLDASH publisher. LLDASH(3)/Stream total(3)/App total(3)
[2021-03-29 21:24:46.013] I [SckPoolSegPub:2072] Socket.Server | server_socket.cpp:129  | [#15] [0x55574034dde0] Client(<ClientSocket: 0x7f3f940206e0, #34, state: Closed, TCP, 95.223.56.1:31665>) is disconnected
[2021-03-29 21:24:46.487] I [SegPubReq:2083] Monitor | stream_metrics.cpp:144  | A session has been stopped playing #default#app/stream on the LLDASH publisher. Concurrent Viewers[LLDASH(2)/Stream total(2)/App total(2)]
[2021-03-29 21:24:46.487] I [SegPubReq:2083] Monitor | stream_metrics.cpp:144  | A session has been stopped playing #default#app/stream on the LLDASH publisher. Concurrent Viewers[LLDASH(1)/Stream total(1)/App total(1)]
[2021-03-29 21:24:50.420] I [AppWorker:2104] Socket.Server | server_socket.cpp:129  | [#15] [0x55574034dde0] Client(<ClientSocket: 0x7f3f94020ad0, #31, state: Closed, TCP, 95.223.56.1:31515>) is disconnected
[2021-03-29 21:24:50.423] I [AppWorker:2104] Socket.Server | server_socket.cpp:129  | [#15] [0x55574034dde0] Client(<ClientSocket: 0x7f3f94020ec0, #32, state: Closed, TCP, 95.223.56.1:31592>) is disconnected
[2021-03-29 21:24:55.618] I [AppWorker:2104] Socket.Server | server_socket.cpp:129  | [#15] [0x55574034dde0] Client(<ClientSocket: 0x7f3f9401e060, #31, state: Closed, TCP, 95.223.56.1:31688>) is disconnected
[2021-03-29 21:24:56.759] I [SckPoolSegPub:2072] Socket.Server | server_socket.cpp:129  | [#15] [0x55574034dde0] Client(<ClientSocket: 0x7f3f9401b350, #31, state: Closed, TCP, 95.223.56.1:31613>) is disconnected
[2021-03-29 21:24:56.760] W [AppWorker:2104] LLDASH | cmaf_stream_server.cpp:126  | Failed to send the chunked data for [#default#app/stream, 10_video_ll.m4s] to <ClientSocket: 0x7f3f9401b350, #31, state: Closed, TCP, 95.223.56.1:31613> (11991 bytes)
[2021-03-29 21:24:56.842] I [SegWorker:2080] Socket.Server | server_socket.cpp:129  | [#15] [0x55574034dde0] Client(<ClientSocket: 0x7f3f940177e0, #31, state: Closed, TCP, 95.223.56.1:53902>) is disconnected
[2021-03-29 21:24:56.998] I [SegWorker:2081] HTTPPublisher | segment_publisher.cpp:188  | Segment requested (#default#app/stream/init_audio_ll.m4s) from 95.223.56.1:31598 : Segment number : 0 Duration : 0
[2021-03-29 21:24:56.998] I [SegWorker:2081] Monitor | stream_metrics.cpp:119  | A new session has started playing #default#app/stream on the LLDASH publisher. LLDASH(2)/Stream total(2)/App total(2)
[2021-03-29 21:24:56.999] I [SegWorker:2081] Socket.Server | server_socket.cpp:129  | [#15] [0x55574034dde0] Client(<ClientSocket: 0x7f3f94027bb0, #32, state: Closed, TCP, 95.223.56.1:31598>) is disconnected
[2021-03-29 21:24:56.003] I [SegWorker:2082] HTTPPublisher | segment_publisher.cpp:188  | Segment requested (#default#app/stream/init_video_ll.m4s) from 95.223.56.1:21888 : Segment number : 0 Duration : 0
[2021-03-29 21:24:56.003] I [SegWorker:2082] Monitor | stream_metrics.cpp:119  | A new session has started playing #default#app/stream on the LLDASH publisher. LLDASH(3)/Stream total(3)/App total(3)
[2021-03-29 21:24:56.004] I [SegWorker:2082] Socket.Server | server_socket.cpp:129  | [#15] [0x55574034dde0] Client(<ClientSocket: 0x7f3f94020180, #31, state: Closed, TCP, 95.223.56.1:21888>) is disconnected
[2021-03-29 21:24:58.488] I [SegPubReq:2083] Monitor | stream_metrics.cpp:144  | A session has been stopped playing #default#app/stream on the LLDASH publisher. Concurrent Viewers[LLDASH(2)/Stream total(2)/App total(2)]
[2021-03-29 21:24:58.488] I [SegPubReq:2083] Monitor | stream_metrics.cpp:144  | A session has been stopped playing #default#app/stream on the LLDASH publisher. Concurrent Viewers[LLDASH(1)/Stream total(1)/App total(1)]
[2021-03-29 21:25:00.404] I [AppWorker:2104] Socket.Server | server_socket.cpp:129  | [#15] [0x55574034dde0] Client(<ClientSocket: 0x7f3f9401f730, #31, state: Closed, TCP, 95.223.56.1:21871>) is disconnected
[2021-03-29 21:25:00.408] I [AppWorker:2104] Socket.Server | server_socket.cpp:129  | [#15] [0x55574034dde0] Client(<ClientSocket: 0x7f3f94021bb0, #32, state: Closed, TCP, 95.223.56.1:53925>) is disconnected
[2021-03-29 21:25:05.505] I [AppWorker:2104] Socket.Server | server_socket.cpp:129  | [#15] [0x55574034dde0] Client(<ClientSocket: 0x7f3f94021fa0, #31, state: Closed, TCP, 95.223.56.1:31526>) is disconnected
[2021-03-29 21:25:05.508] I [AppWorker:2104] Socket.Server | server_socket.cpp:129  | [#15] [0x55574034dde0] Client(<ClientSocket: 0x7f3f940225e0, #32, state: Closed, TCP, 95.223.56.1:31622>) is disconnected
[2021-03-29 21:25:10.411] I [AppWorker:2104] Socket.Server | server_socket.cpp:129  | [#15] [0x55574034dde0] Client(<ClientSocket: 0x7f3f940295c0, #31, state: Closed, TCP, 95.223.56.1:31699>) is disconnected
[2021-03-29 21:25:15.664] I [AppWorker:2104] Socket.Server | server_socket.cpp:129  | [#15] [0x55574034dde0] Client(<ClientSocket: 0x7f3f940287b0, #31, state: Closed, TCP, 95.223.56.1:53930>) is disconnected
[2021-03-29 21:25:20.364] I [AppWorker:2104] Socket.Server | server_socket.cpp:129  | [#15] [0x55574034dde0] Client(<ClientSocket: 0x7f3f94028430, #31, state: Closed, TCP, 95.223.56.1:31587>) is disconnected
[2021-03-29 21:25:25.503] I [AppWorker:2104] Socket.Server | server_socket.cpp:129  | [#15] [0x55574034dde0] Client(<ClientSocket: 0x7f3f9401bdc0, #31, state: Closed, TCP, 95.223.56.1:22012>) is disconnected
[2021-03-29 21:25:27.187] I [SegWorker:2080] Socket.Server | server_socket.cpp:129  | [#15] [0x55574034dde0] Client(<ClientSocket: 0x7f3f94029c00, #32, state: Closed, TCP, 95.223.56.1:21995>) is disconnected
[2021-03-29 21:25:30.392] I [AppWorker:2104] Socket.Server | server_socket.cpp:129  | [#15] [0x55574034dde0] Client(<ClientSocket: 0x7f3f94028d70, #31, state: Closed, TCP, 95.223.56.1:31684>) is disconnected

@sephentos

sephentos commented Apr 16, 2021

I can confirm that this issue still exists in the newly released v0.11.3, which contains the UTCTiming change.
Firefox and Safari are still unable to play LL-DASH with v0.11.3.
@getroot

@basisbit
Contributor Author

basisbit commented Jun 4, 2021

any update on this?

@sephentos

Only update I can contribute is that it still does not work with v0.12.0 / current master state.
LL-DASH still not working on firefox/safari for us.

@dimiden
Member

dimiden commented Jun 8, 2021

I hope LL-DASH stabilizes as soon as possible, but I'm currently working on another project, so I can't spend much time on this work.
I'll comment if there is any update on this task.

@getroot
Sponsor Member

getroot commented Jun 2, 2022

We're sorry, but with the release of LLHLS, we have decided to no longer update LLDASH. Because LLHLS has better performance and compatibility, we decided it was a better decision for everyone to focus more of our energy on it.

Instead, we will continue to focus more on LLHLS and WebRTC. Thanks for your contribution.

@basisbit
Contributor Author

basisbit commented Jun 2, 2022

LLHLS support? yay! That is definitely a huge improvement! THANK YOU SO MUCH!!!

@basisbit basisbit closed this as not planned Jun 2, 2022
@getroot
Sponsor Member

getroot commented Jun 2, 2022

@basisbit Yes, LLHLS was released with HTTP/2. Please check the issues below.

#766

ABR is also supported.

#777


4 participants