feat(offline): improve the speed of offline downloads #4168
Conversation
Examples in Demo app: Angel One (multicodec, multilingual)
Sintel 4k (multicodec)
Sintel w/ trick mode (MP4 only, 720p)
Tears of Steel (multicodec, TTML)
lib/offline/storage.js (outdated)

```js
const numberOfParallelSegmentsByStream =
    config.offline.numberOfParallelSegmentsByStream;
if (numberOfParallelSegmentsByStream > 1) {
  // Note: 1000 is a random number to not coincide with other groups
```
nit: It's not random, it's arbitrary. And you should note that 1000 is safe so long as there are fewer than 1000 actual streams.
lib/offline/storage.js (outdated)

```js
groupId += 1000 + Math.floor(Math.random() *
    numberOfParallelSegmentsByStream);
```
If the goal is to have, say, 5 parallel tracks of audio and 5 of video, this won't do that. For example:

- video stream, id 1 => 1001 - 1005
- audio stream, id 2 => 1002 - 1006
- audio stream, id 3 => 1003 - 1007
- text stream, id 4 => 1004 - 1008

So there will be mixing between groups.
However, why not mix between groups? Why not assign each segment to a random group in the range of the config variable?
```js
const groupId = Math.floor(
    Math.random() * numberOfParallelSegmentsByStream);
```
Now that I'm looking at the problem this way, it may be better to rename the config from `numberOfParallelSegmentsByStream` to `numberOfParallelDownloads`. It's not "by stream" any more, nor did it ever really need to be.
The consistency of your results would also improve if:
- groupIds were assigned in a round-robin manner instead of purely random
- the default were set to something like 4 or 5 instead of 1
I would then call this a "feat" instead of a "fix", since the default would be different than the old behavior. But feat vs fix here is somewhat subjective. I wouldn't insist.
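A round-robin assignment along the lines suggested above could be sketched like this (a standalone illustration, not the PR's actual code; the function and config names are hypothetical, with `numberOfParallelDownloads` taken from the rename suggestion):

```js
// Sketch: assign each segment download to a group in round-robin order.
// Groups stay evenly loaded, and results are more consistent than with
// Math.random()-based assignment.
function makeRoundRobinGrouper(numberOfParallelDownloads) {
  let next = 0;
  return () => {
    const groupId = next;
    next = (next + 1) % numberOfParallelDownloads;
    return groupId;
  };
}

// With 5 parallel downloads, 12 segments land on groups
// 0 1 2 3 4 0 1 2 3 4 0 1 -- no group ever gets more than its fair share.
const nextGroupId = makeRoundRobinGrouper(5);
const assigned = [];
for (let i = 0; i < 12; i++) {
  assigned.push(nextGroupId());
}
console.log(assigned.join(' '));  // 0 1 2 3 4 0 1 2 3 4 0 1
```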
Done!
This is a wonderfully simple solution, but I wonder if we can do better than ~20% by coalescing requests. The ramp-up time for a TCP stream can be significant, and requesting segments over and over can make it difficult to achieve the ideal throughput. That said, I would still take this fix as an intermediate solution, as it's very cheap, and ~20% is a definite improvement.
Force-pushed from 82f3f76 to c917145, then from c917145 to 62f6bce.
@joeyparrish I ran multiple tests, and what definitely makes the download slow is https://github.com/shaka-project/shaka-player/blob/main/lib/offline/storage.js#L471
Ah, and that's called serially after each segment download, right? So we spend some of our "download time" actually waiting on database updates, and we will never utilize all available bandwidth. Looks like this could use a redesign.

I wish we had done some performance testing on it when this design was first proposed/implemented. The theory was that, to enable us to resume downloading if an app was closed, we would need to track each segment in the DB as it was downloaded. I don't recall exactly if that resume call was ever implemented, though. So maybe this is a waste.

Should we continue with this PR to parallelize, or try to address the slow DB update instead? (Or both?)

Is it slow because of calls like this loading a lot of data? (shaka-player/lib/offline/storage.js, line 575 at feefd7b)

Or this big double-loop? (shaka-player/lib/offline/storage.js, line 581 at feefd7b)
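To illustrate the serial-DB-update concern: the sketch below (with hypothetical `downloadSegment`/`updateDb` stand-ins, not Shaka's actual functions) shows how awaiting a DB write after every segment serializes the pipeline, and how overlapping the write with the next download would hide that latency.

```js
// Illustrative sketch only: why a per-segment DB update on the critical
// path caps throughput. Both helpers passed in are hypothetical.
async function downloadSerially(segments, downloadSegment, updateDb) {
  for (const seg of segments) {
    const data = await downloadSegment(seg);  // network transfer
    await updateDb(seg, data);                // DB write blocks the next download
  }
}

// Overlapping each DB write with the next download hides the DB latency:
// only the *previous* write is awaited before issuing a new one.
async function downloadPipelined(segments, downloadSegment, updateDb) {
  let pendingDbWrite = Promise.resolve();
  for (const seg of segments) {
    const data = await downloadSegment(seg);
    await pendingDbWrite;
    pendingDbWrite = updateDb(seg, data);
  }
  await pendingDbWrite;  // flush the final write
}
```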
Never mind that. I forgot that we had already discussed it and decided that this was an easy win. So let's finish this PR and then worry about the DB afterward. Thank you for your hard work and analysis on this!

I made two small edits so that we could merge this PR now. I hope you don't mind. I'll wait for one more test pass, since I may have made a typo, and then I'll merge if it looks good. Thanks!

That was specifically part of the changes that were going to make background fetch possible. But that appears to have gone way down in priority, so maybe undoing the changes would be worthwhile.

Right! Thanks for the reminder!
Adds the ability to configure the number of parallel downloads per stream. A higher value can decrease the total download time.
Related to #4166
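Usage might look roughly like this. The field name follows the rename suggested in review (`numberOfParallelDownloads`); the final merged name and default are not shown in this excerpt, so treat both as assumptions:

```js
// Hypothetical usage sketch: raise the number of parallel offline
// downloads via the player configuration. Field name is assumed from
// the review discussion, not confirmed by this thread.
const player = new shaka.Player(videoElement);
player.configure({
  offline: {
    numberOfParallelDownloads: 5,  // review suggested a default of 4 or 5
  },
});
```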