
feat(offline): improve the speed of offline downloads #4168

Merged: 3 commits merged into shaka-project:main Apr 28, 2022

Conversation

@avelad (Collaborator) commented Apr 27, 2022

Add the ability to configure the number of parallel segment downloads per stream. A higher value can reduce the total download time.

Related to #4166
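As a rough back-of-the-envelope model of why more parallel downloads help (this is an illustration, not Shaka Player code; the function name here is made up): with n similarly sized segments taking about t ms each and k parallel downloads, the wall time is roughly ceil(n / k) * t. Real gains are far smaller than this model suggests (13–21% in the benchmarks below) because parallel requests share the same bandwidth and storage writes add overhead.

```javascript
// Illustrative model only, not Shaka Player code: ideal wall time for
// downloading n equal segments of t ms each with k parallel downloads,
// ignoring bandwidth sharing, TCP ramp-up, and storage overhead.
function estimatedWallTimeMs(numSegments, segmentTimeMs, parallelism) {
  return Math.ceil(numSegments / parallelism) * segmentTimeMs;
}

console.log(estimatedWallTimeMs(100, 200, 1)); // 20000
console.log(estimatedWallTimeMs(100, 200, 5)); // 4000
```

The gap between this ideal 5x speedup and the measured 13–21% is exactly what the later discussion about shared bandwidth and per-segment database writes explains.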

@avelad (Collaborator, author) commented Apr 27, 2022

Example timings from the Demo app:

Angel One (multicodec, multilingual)

  • numberOfParallelSegmentsByStream=1 --> 8480ms
  • numberOfParallelSegmentsByStream=5 --> 7339ms (-13%)

Sintel 4k (multicodec)

  • numberOfParallelSegmentsByStream=1 --> 90921ms
  • numberOfParallelSegmentsByStream=5 --> 71798ms (-21%)

Sintel w/ trick mode (MP4 only, 720p)

  • numberOfParallelSegmentsByStream=1 --> 90448ms
  • numberOfParallelSegmentsByStream=5 --> 79293ms (-12%)

Tears of Steel (multicodec, TTML)

  • numberOfParallelSegmentsByStream=1 --> 79831ms
  • numberOfParallelSegmentsByStream=5 --> 68895ms (-13%)

@avelad avelad requested a review from joeyparrish April 27, 2022 07:51
```js
const numberOfParallelSegmentsByStream =
    config.offline.numberOfParallelSegmentsByStream;
if (numberOfParallelSegmentsByStream > 1) {
  // Note: 1000 is a random number to not coincide with other groups
```

Inline review comment from a Member:
nit: It's not random, it's arbitrary. And you should note that 1000 is safe so long as there are fewer than 1000 actual streams.

Comment on lines 1343 to 1344:

```js
groupId += 1000 + Math.floor(Math.random() *
    numberOfParallelSegmentsByStream);
```

Inline review comment from a Member:

If the goal is to have, say, 5 parallel tracks of audio and 5 of video, this won't do that. For example:

video stream, id 1 => 1001 - 1005
audio stream, id 2 => 1002 - 1006
audio stream, id 3 => 1003 - 1007
text stream, id 4 => 1004 - 1008

So there will be mixing between groups.

However, why not mix between groups? Why not assign each segment to a random group in the range of the config variable?

```js
const groupId = Math.floor(
    Math.random() * numberOfParallelSegmentsByStream);
```

Now that I'm looking at the problem this way, it may be better to change the config from numberOfParallelSegmentsByStream to numberOfParallelDownloads. It's not "by stream" any more, nor did it ever really need to be.

The consistency of your results would also improve if:

  1. groupIds were assigned in a round-robin manner instead of purely random
  2. the default were set to something like 4 or 5 instead of 1

I would then call this a "feat" instead of a "fix", since the default would be different than the old behavior. But feat vs fix here is somewhat subjective. I wouldn't insist.
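A hypothetical sketch of the round-robin assignment suggested in point 1 above (the names here are illustrative, not the ones in the merged code): each segment takes the next group id in turn, cycling through the configured number of parallel downloads, so segments spread evenly across groups regardless of which stream they belong to.

```javascript
// Illustrative sketch of round-robin group assignment, not the actual
// merged Shaka code. Returns a function that yields group ids
// 0, 1, ..., n-1, 0, 1, ... in strict rotation.
function makeRoundRobinGroupId(numberOfParallelDownloads) {
  let next = 0;
  return () => {
    const groupId = next;
    next = (next + 1) % numberOfParallelDownloads;
    return groupId;
  };
}

const nextGroupId = makeRoundRobinGroupId(4);
console.log([1, 2, 3, 4, 5, 6].map(() => nextGroupId())); // [0, 1, 2, 3, 0, 1]
```

Unlike the purely random assignment, rotation guarantees every group receives within one segment of the same load, which is why it would make the timing results more consistent.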

@avelad (Collaborator, author) replied:

Done!

@joeyparrish (Member) commented:
This is a wonderfully simple solution, but I wonder if we can do better than ~20% by coalescing requests. The ramp-up time for a TCP stream can be significant, and requesting segments over and over can make it difficult to achieve the ideal throughput.

That said, I would still take this fix as an intermediate solution, as it's very cheap, and ~20% is a definite improvement.

@avelad avelad changed the title fix(offline): offline downloads are slow feat(offline): improve the speed of offline downloads Apr 28, 2022
@avelad (Collaborator, author) commented Apr 28, 2022

@joeyparrish I ran multiple tests, and what definitely makes the download slow is https://github.com/shaka-project/shaka-player/blob/main/lib/offline/storage.js#L471

@joeyparrish (Member) replied:

> @joeyparrish I ran multiple tests, and what definitely makes the download slow is https://github.com/shaka-project/shaka-player/blob/main/lib/offline/storage.js#L471

Ah, and that's called serially after each segment download, right? So we spend some of our "download time" actually waiting on database updates, so we will never utilize all available bandwidth.

Looks like this could use a redesign. I wish we had done some performance testing on it when this design was first proposed/implemented.

The theory was that to enable us to resume downloading if an app was closed, we would need to track each segment in the DB as it was downloaded. I don't recall exactly if that resume call was ever implemented, though. So maybe this is a waste.

Should we continue with this PR to parallelize, or try to address the slow DB update instead? (Or both?)

Is it slow because of calls like this loading a lot of data?

```js
const manifests = await activeHandle.cell.getManifests([manifestId]);
```

Or this big double-loop?

```js
for (const stream of manifestDB.streams) {
```
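The serialization concern above can be sketched as follows (illustrative only, not Shaka Player code; both function names are made up): if each segment download awaits a database write before the next request starts, the network sits idle during every write, whereas letting the write proceed while the next fetch begins keeps the connection busy.

```javascript
// Illustrative only, not Shaka Player code: contrast awaiting a DB
// write per segment with overlapping writes and downloads.

// Network idles during every write: fetch, write, fetch, write, ...
async function downloadSerially(segments, fetchSeg, writeDb) {
  for (const seg of segments) {
    const data = await fetchSeg(seg);
    await writeDb(seg, data); // next fetch waits on this write
  }
}

// Writes overlap with subsequent fetches; settle them all at the end.
async function downloadOverlapped(segments, fetchSeg, writeDb) {
  const pendingWrites = [];
  for (const seg of segments) {
    const data = await fetchSeg(seg);
    pendingWrites.push(writeDb(seg, data)); // don't block the next fetch
  }
  await Promise.all(pendingWrites);
}
```

In the overlapped version the DB latency hides behind network time instead of adding to it; real code would need more careful error handling and per-segment resume bookkeeping, which is the very reason the per-segment write existed.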

(Resolved review threads, now outdated, on externs/shaka/player.js and lib/offline/storage.js.)
@joeyparrish (Member) commented:

> Should we continue with this PR to parallelize, or try to address the slow DB update instead? (Or both?)

Never mind that. I forgot that we had already discussed it and decided that this was an easy win. So let's finish this PR and then worry about the DB afterward.

Thank you for your hard work and analysis on this!

@joeyparrish (Member) commented:

I made two small edits so that we could merge this PR now. I hope you don't mind.

I'll wait for one more test pass, since I may have made a typo, and then I'll merge if it looks good. Thanks!

@theodab (Collaborator) commented Apr 28, 2022

> > @joeyparrish I ran multiple tests, and what definitely makes the download slow is https://github.com/shaka-project/shaka-player/blob/main/lib/offline/storage.js#L471
>
> Ah, and that's called serially after each segment download, right? So we spend some of our "download time" actually waiting on database updates, so we will never utilize all available bandwidth.
>
> Looks like this could use a redesign. I wish we had done some performance testing on it when this design was first proposed/implemented.
>
> The theory was that to enable us to resume downloading if an app was closed, we would need to track each segment in the DB as it was downloaded. I don't recall exactly if that resume call was ever implemented, though. So maybe this is a waste.

That was specifically part of the changes that were going to make background fetch possible. But that's looking to have gone way down in priority, so maybe undoing the changes would be worthwhile.

@joeyparrish (Member) replied:

> That was specifically part of the changes that were going to make background fetch possible. But that's looking to have gone way down in priority, so maybe undoing the changes would be worthwhile.

Right! Thanks for the reminder!

@joeyparrish joeyparrish merged commit 73f6de3 into shaka-project:main Apr 28, 2022
@avelad avelad deleted the parallel-downloads branch April 29, 2022 05:30
@avelad avelad added this to the v4.0 milestone May 4, 2022
@github-actions github-actions bot added the status: archived Archived and locked; will not be updated label Jul 25, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jul 25, 2023