fix(offline): Speed up offline storage by ~87% #4176

Merged
merged 3 commits into from Apr 29, 2022
Changes from 1 commit
85 changes: 51 additions & 34 deletions lib/offline/storage.js
@@ -196,7 +196,7 @@ shaka.offline.Storage = class {
goog.asserts.assert(typeof(config) == 'object', 'Should be an object!');

goog.asserts.assert(
this.config_, 'Cannot reconfigure stroage after calling destroy.');
this.config_, 'Cannot reconfigure storage after calling destroy.');
return shaka.util.PlayerConfiguration.mergeConfigObjects(
/* destination= */ this.config_, /* updates= */ config );
}
@@ -441,36 +441,34 @@ shaka.offline.Storage = class {
async downloadSegments_(
toDownload, manifestId, manifestDB, downloader, config, storage,
manifest, drmEngine) {
let pendingManifestUpdates = {};
Collaborator

What happens to these pending manifest updates if we throw due to tripping the this.ensureNotDestroyed_() check? If that happens, they might not be added to the manifest, so the static cleanup code won't know to clean them up.

Member Author

I think that what would happen now is:

  1. Downloaded segments are stored in the database
  2. Those segments are orphaned and not referred to in any manifest

That can also happen if the user closes the page, though.

Let me look at two things in a follow-up:

  1. How to deal with orphaned segments (which can happen without cancellation)
  2. How to clean up segments in the same session on abort()
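
For illustration only, a rough sketch of what the orphan sweep in item 1 could look like. The enumeration methods getAllManifests() and getAllSegmentKeys() are hypothetical stand-ins; only addSegments() and removeSegments() appear in this diff, so the real StorageCell API may differ.

  // Hypothetical follow-up sketch, not part of this PR: drop stored segments
  // that no manifest references. getAllManifests() and getAllSegmentKeys()
  // are assumed methods and may not match the real StorageCell API.
  static async removeOrphanedSegments_() {
    const muxer = new shaka.offline.StorageMuxer();
    try {
      await muxer.init();
      const activeHandle = await muxer.getActive();

      // Collect every data key that some stored manifest still points to.
      const referenced = new Set();
      for (const manifestDB of await activeHandle.cell.getAllManifests()) {
        for (const stream of manifestDB.streams) {
          for (const segment of stream.segments) {
            if (segment.dataKey != null) {
              referenced.add(segment.dataKey);
            }
            if (segment.initSegmentKey != null) {
              referenced.add(segment.initSegmentKey);
            }
          }
        }
      }

      // Anything stored but unreferenced was left behind by an aborted or
      // interrupted download and can be removed.
      const allKeys = await activeHandle.cell.getAllSegmentKeys();
      const orphans = allKeys.filter((key) => !referenced.has(key));
      if (orphans.length) {
        await activeHandle.cell.removeSegments(orphans, (key) => {});
      }
    } finally {
      await muxer.destroy();
    }
  }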

Collaborator

Sounds good.

let pendingDataSize = 0;

/**
* @param {!Array.<!shaka.offline.DownloadInfo>} toDownload
* @param {boolean} updateDRM
*/
const download = async (toDownload, updateDRM) => {
const throwIfAbortedFn = () => {
this.ensureNotDestroyed_();
};
for (const download of toDownload) {
/** @param {?BufferSource} data */
let data;
const request = download.makeSegmentRequest(config);
const estimateId = download.estimateId;
const isInitSegment = download.isInitSegment;
const onDownloaded = (d) => {
data = d;
return Promise.resolve();
};
downloader.queue(download.groupId,
request, estimateId, isInitSegment, onDownloaded);
downloader.queueWork(download.groupId, async () => {
goog.asserts.assert(data, 'We should have loaded data by now');
goog.asserts.assert(data instanceof ArrayBuffer,
'The data should be an ArrayBuffer');

const onDownloaded = async (data) => {
// Store the data.
const dataKeys = await storage.addSegments([{data}]);
this.ensureNotDestroyed_();

// Store the necessary update to the manifest, to be processed later.
const ref = /** @type {!shaka.media.SegmentReference} */ (
download.ref);
manifestDB = (await shaka.offline.Storage.assignStreamToManifest(
manifestId, ref, {data}, throwIfAbortedFn)) || manifestDB;
});
const id = shaka.offline.DownloadInfo.idForSegmentRef(ref);
pendingManifestUpdates[id] = dataKeys[0];
pendingDataSize += data.byteLength;
};

downloader.queue(download.groupId,
request, estimateId, isInitSegment, onDownloaded);
}
await downloader.waitToFinish();

@@ -496,12 +494,30 @@ shaka.offline.Storage = class {
await download(toDownload.filter((info) => info.isInitSegment), true);
this.ensureNotDestroyed_();
toDownload = toDownload.filter((info) => !info.isInitSegment);

// Copy these and reset them now, before calling await.
const manifestUpdates = pendingManifestUpdates;
const dataSize = pendingDataSize;
pendingManifestUpdates = {};
pendingDataSize = 0;

manifestDB =
(await shaka.offline.Storage.assignStreamToManifest(
manifestId, manifestUpdates, dataSize,
() => this.ensureNotDestroyed_())) || manifestDB;
this.ensureNotDestroyed_();
}

if (!usingBgFetch) {
await download(toDownload, false);
this.ensureNotDestroyed_();

manifestDB =
(await shaka.offline.Storage.assignStreamToManifest(
manifestId, pendingManifestUpdates, pendingDataSize,
() => this.ensureNotDestroyed_())) || manifestDB;
this.ensureNotDestroyed_();

goog.asserts.assert(
!manifestDB.isIncomplete, 'The manifest should be complete by now');
} else {
@@ -538,19 +554,18 @@ shaka.offline.Storage = class {
* changes of the other.
*
* @param {number} manifestId
* @param {!shaka.media.SegmentReference} ref
* @param {shaka.extern.SegmentDataDB} data
* @param {!Object.<string, number>} manifestUpdates
Collaborator

The JSDoc description for this function is now out of date. It should now talk about manifestUpdates instead of ref, for instance.

Member Author

Docs updated.

* @param {number} dataSizeUpdate
* @param {function()} throwIfAbortedFn A function that should throw if the
* download has been aborted.
* @return {!Promise.<?shaka.extern.ManifestDB>}
*/
static async assignStreamToManifest(manifestId, ref, data, throwIfAbortedFn) {
static async assignStreamToManifest(
Collaborator

Nit: The method should probably be called assignStreamsToManifest now that it can do multiple at once.

Member Author

Renamed to assignSegmentsToManifest, since it is now only setting the database keys of the segments. The streams are already part of the manifest.

manifestId, manifestUpdates, dataSizeUpdate, throwIfAbortedFn) {
/** @type {shaka.offline.StorageMuxer} */
const muxer = new shaka.offline.StorageMuxer();

const idForRef = shaka.offline.DownloadInfo.idForSegmentRef(ref);
let manifestUpdated = false;
let dataKey;
let activeHandle;
/** @type {!shaka.extern.ManifestDB} */
let manifestDB;
@@ -561,11 +576,6 @@ shaka.offline.Storage = class {
await muxer.init();
activeHandle = await muxer.getActive();

// Store the data.
const dataKeys = await activeHandle.cell.addSegments([data]);
dataKey = dataKeys[0];
throwIfAbortedFn();

// Acquire the mutex before accessing the manifest, since there could be
// multiple instances of this method running at once.
mutexId = await shaka.offline.Storage.mutex_.acquire();
Collaborator

I don't think we need this mutex anymore? Since any given store operation is only going to be running this method once or twice anyway, and in the two-calls case they definitely don't happen at the same time. So there's no risk of a given storage entry being mutated by two different threads at once.

And if we get rid of this mutex, we could get rid of the shaka.util.Mutex class too.

Member Author

Removed Mutex.
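
For context, a condensed (not literal) paraphrase of the two call sites in downloadSegments_ above, showing why the lock is unnecessary: each commit is awaited before the next download phase begins, so at most one read-modify-write of the manifest entry is in flight per store operation.

  // Condensed paraphrase of downloadSegments_ above, not a literal excerpt.
  // (assignStreamToManifest is renamed to assignSegmentsToManifest in a later
  // commit, per the thread above.)
  // Init segments are downloaded and committed first...
  await download(toDownload.filter((info) => info.isInitSegment), true);
  manifestDB = (await shaka.offline.Storage.assignStreamToManifest(
      manifestId, manifestUpdates, dataSize,
      () => this.ensureNotDestroyed_())) || manifestDB;

  // ...and only then are the remaining media segments downloaded and
  // committed, so the two manifest updates never overlap in time.
  await download(toDownload.filter((info) => !info.isInitSegment), false);
  manifestDB = (await shaka.offline.Storage.assignStreamToManifest(
      manifestId, pendingManifestUpdates, pendingDataSize,
      () => this.ensureNotDestroyed_())) || manifestDB;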

@@ -580,19 +590,25 @@ shaka.offline.Storage = class {
let complete = true;
for (const stream of manifestDB.streams) {
for (const segment of stream.segments) {
if (segment.pendingSegmentRefId == idForRef) {
let dataKey = segment.pendingSegmentRefId ?
manifestUpdates[segment.pendingSegmentRefId] : null;
if (dataKey != null) {
segment.dataKey = dataKey;
// Now that the segment has been associated with the appropriate
// dataKey, the pendingSegmentRefId is no longer necessary.
segment.pendingSegmentRefId = undefined;
}
if (segment.pendingInitSegmentRefId == idForRef) {

dataKey = segment.pendingInitSegmentRefId ?
manifestUpdates[segment.pendingInitSegmentRefId] : null;
if (dataKey != null) {
segment.initSegmentKey = dataKey;
// Now that the init segment has been associated with the
// appropriate initSegmentKey, the pendingInitSegmentRefId is no
// longer necessary.
segment.pendingInitSegmentRefId = undefined;
}

if (segment.pendingSegmentRefId) {
complete = false;
}
@@ -603,7 +619,7 @@ shaka.offline.Storage = class {
}

// Update the size of the manifest.
manifestDB.size += data.data.byteLength;
manifestDB.size += dataSizeUpdate;

// Mark the manifest as complete, if all segments are downloaded.
if (complete) {
@@ -617,11 +633,12 @@ shaka.offline.Storage = class {
} catch (e) {
await shaka.offline.Storage.cleanStoredManifest(manifestId);

if (activeHandle && !manifestUpdated && dataKey) {
if (activeHandle && !manifestUpdated) {
const dataKeys = Object.values(manifestUpdates);
// The cleanStoredManifest method will not "see" any segments that have
// been downloaded but not assigned to the manifest yet. So un-store
// them separately.
await activeHandle.cell.removeSegments([dataKey], (key) => {});
await activeHandle.cell.removeSegments(dataKeys, (key) => {});
}

throw e;