Start playing offline content before download is complete #6244

Open
loicraux opened this issue Feb 12, 2024 · 5 comments
Labels
component: offline The issue involves the offline storage system of Shaka Player priority: P3 Useful but not urgent type: enhancement New feature or request

@loicraux
Contributor

Have you read the Tutorials?
Yes

Have you read the FAQ and checked for duplicate open issues?
Yes

If the question is related to FairPlay, have you read the tutorial?

N/A

What version of Shaka Player are you using?
Latest

What browser and OS are you using?
Chrome

Please ask your question

Is it currently possible, with the latest version of Shaka Player, to start playing content that is still being downloaded through the shaka.offline.Storage API? That is, to start playback before the download is complete?

If not, would you consider this an easy, moderately difficult, or difficult feature to add to Shaka Player?

It's a subject we're very interested in, and I'd be willing to help with a PR, time permitting, if given some implementation direction and advice (I'm not familiar with Shaka Player's design or code).

@loicraux loicraux added the type: question A question from the community label Feb 12, 2024
@joeyparrish
Member

I think it would take a fair bit of research in the Shaka code. You could pretty easily change the way we write to the database to make content discoverable before it's completely downloaded, and provide early access to the assigned ID.

However, the hard part (I expect) would be making sure the player behavior at the edge of the download is reasonable.

It might be enough to:

  1. Write enough metadata upfront to create a complete manifest, referencing segments that haven't been downloaded yet
  2. Add a field to mark offline content as incomplete, for clarity and to enable resuming incomplete downloads later
  3. Make the callback that loads segments able to block until a segment has been written, in case playback catches up to download progress. (I'm not sure what signal you would use to know when the segment was complete. Ideally, you'd find a way to avoid polling the database.)

To allow downloads to resume later, you might also want to store (temporarily) a list of segment URLs & byte ranges.
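For step 3, one way to avoid polling the database might be an in-memory notifier that the downloader signals after each segment is committed. This is only a sketch under that assumption: SegmentGate, markWritten and waitFor are hypothetical names, not part of Shaka Player, and it only works if the download and the playback run in the same page.

```typescript
// Hypothetical helper, not part of Shaka Player: lets a segment loader
// wait for a specific segment key without polling the database.
class SegmentGate {
  private readonly written = new Set<number>();
  private readonly waiters = new Map<number, Array<() => void>>();

  /** Called by the downloader right after a segment is committed to storage. */
  markWritten(key: number): void {
    this.written.add(key);
    const pending = this.waiters.get(key) ?? [];
    this.waiters.delete(key);
    for (const resolve of pending) {
      resolve();
    }
  }

  /** Resolves immediately if the segment is already stored, otherwise once it is. */
  waitFor(key: number): Promise<void> {
    if (this.written.has(key)) {
      return Promise.resolve();
    }
    return new Promise<void>((resolve) => {
      const pending = this.waiters.get(key) ?? [];
      pending.push(resolve);
      this.waiters.set(key, pending);
    });
  }
}
```

The segment-loading callback would await waitFor(key) before reading from storage, so it only blocks when playback has caught up with the download.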

@joeyparrish joeyparrish added type: enhancement New feature or request component: offline The issue involves the offline storage system of Shaka Player priority: P3 Useful but not urgent and removed type: question A question from the community labels Feb 12, 2024
@shaka-bot shaka-bot added this to the Backlog milestone Feb 12, 2024
@loicraux
Contributor Author

Thank you for your reply. It helps me assess the work that needs to be done, and shows that the main difficulty lies in the player's behavior when the playhead reaches the edge of what has already been downloaded. If this becomes a pressing need on my side, I now at least have an answer to give.

Regarding the second point (add a field to mark offline content as incomplete), this is something I'm already planning to do: mark downloads as IN PROGRESS while downloading (in the metadata?) and COMPLETED once fully downloaded, so that I can clean up downloads that were interrupted (and are therefore invalid) by a machine (or network?) outage. I can abort and remove ongoing downloads when the (Electron) app exits, but I cannot handle the app being abruptly killed, hence the need for such flags and for cleanup work at startup, I think. I'm not sure what Shaka Player's current behavior is in this case: is the Storage.remove(contentUri) API able to correctly delete an interrupted download from disk?
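A minimal sketch of the startup check described above, assuming the application keeps its own status record for each download. The DownloadRecord type, the status values and findInterruptedDownloads are hypothetical names for illustration, not Shaka Player API.

```typescript
// Hypothetical application-side bookkeeping, not Shaka Player API.
type DownloadStatus = 'IN_PROGRESS' | 'COMPLETED';

interface DownloadRecord {
  offlineUri: string;     // URI identifying the stored content
  status: DownloadStatus; // flag written by the app during and after download
}

/**
 * At startup no download can be running yet, so every record still marked
 * IN_PROGRESS must come from a download that was interrupted.
 */
function findInterruptedDownloads(records: readonly DownloadRecord[]): string[] {
  return records
    .filter((record) => record.status === 'IN_PROGRESS')
    .map((record) => record.offlineUri);
}
```

What to do with the returned URIs (remove them, or resume them) is left to the application.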

@joeyparrish
Member

> Regarding the second point (add a field to mark offline content as incomplete), this is something I'm already planning to do: mark downloads as IN PROGRESS while downloading (in the metadata?) and COMPLETED once fully downloaded, so that I can clean up downloads that were interrupted (and are therefore invalid) by a machine (or network?) outage. I can abort and remove ongoing downloads when the (Electron) app exits, but I cannot handle the app being abruptly killed, hence the need for such flags and for cleanup work at startup, I think.

Yes, please! I found a WIP local branch where I started working on a cleanup feature, but I lost track of it and the last commit is from April 2022. It is not exactly as you described, though. Rather than marking content incomplete/complete, it identifies orphaned segments in the database and deletes them. I've uploaded it to https://github.com/joeyparrish/shaka-player/tree/wip-offline-orphans in case you want to use any part of it. You have my permission to use any part of that, or none of it.

> I'm not sure what Shaka Player's current behavior is in this case: is the Storage.remove(contentUri) API able to correctly delete an interrupted download from disk?

No, I don't believe so. I believe the segments get orphaned, but that's based on my memory alone and not any recent review of the code. What I recall is that we first write segments, because segment IDs are generated as each one is inserted. Then we write the database equivalent of the manifest, which references those segment IDs. So if the process is interrupted, some number of segments are written, but no manifest references them.

So in addition to adding a field for "complete" status, the manifest would have to be written before the segments. And what would be ideal for playback is if the list of segments were written upfront, so that the player could know how to request segments that are not yet written. So you may also want to have some way to reserve segment IDs in advance, or to construct them in some formulaic way. (I don't recall how they are done today. If it's an auto-increment database ID, that could be an issue.)
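Reserving segment IDs in advance could look roughly like the sketch below. SegmentKeyAllocator is a hypothetical name, and the sketch assumes keys are plain numbers chosen by the caller rather than by the database. If keys come from an auto-increment ID, as wondered above, this approach would not apply as-is, and the counter would also have to be persisted between sessions.

```typescript
// Hypothetical allocator, not part of Shaka Player: hands out contiguous
// ranges of numeric keys so that a manifest can reference segments that
// have not been written yet.
class SegmentKeyAllocator {
  constructor(private nextKey: number = 0) {}

  /** Reserves `count` consecutive keys and returns them in order. */
  reserve(count: number): number[] {
    if (!Number.isInteger(count) || count < 0) {
      throw new RangeError('count must be a non-negative integer');
    }
    const start = this.nextKey;
    this.nextKey += count;
    return Array.from({ length: count }, (_, i) => start + i);
  }
}
```

With the keys known upfront, the manifest could be written first and each segment stored later under its reserved key.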

At least, that's what I recall. I haven't worked on that area of the code in a while, and it has been years since I have been able to keep up with every single change to the project. (In fact, I've just come back from 3 months on leave, so I've missed a ton of recent changes!)

@theodab
Collaborator

theodab commented Feb 15, 2024

Some of the things Joey cited have actually been done already, as part of the prework for the background fetch change that ended up being put on the backburner (#879).

> 1. Write enough metadata upfront to create a complete manifest, referencing segments that haven't been downloaded yet

The call to store creates an initial, mostly empty manifest; then, as segments are downloaded, they can be added to the offline manifest using shaka.offline.Storage.assignSegmentsToManifest. I don't know how much work it would take to make that initial manifest playable (it's been years since I last worked on this), but it's a start.

Allowing for the playback of incomplete manifests would probably require changing how shaka.offline.OfflineManifestParser works.

> 2. Add a field to mark offline content as incomplete, for clarity and to enable resuming incomplete downloads later

A flag for this exists.

* @property {boolean} isIncomplete
* If true, the content is still downloading. Manifests with this set cannot
* be played yet.

I suppose the JSDoc would need to be amended if we make it so that incomplete manifests can be played after all.

> 3. Make the callback that loads segments able to block until a segment has been written, in case playback catches up to download progress. (I'm not sure what signal you would use to know when the segment was complete. Ideally, you'd find a way to avoid polling the database.)

This part, on the other hand, would definitely require work.

@loicraux
Contributor Author

loicraux commented Mar 8, 2024

For anyone interested, I ended up adding this piece of code to my app, since I don't have enough time at the moment to integrate @joeyparrish's WIP local branch into a PR to Shaka Player. It runs at app startup and works fine. (I guess I don't need to add the initSegmentKey to the set of referred segment keys, but it does no harm.)

Note that it is important to call storage.list() at least once before cleaning up the database with removeOrphanedSegmentsInShakaOfflineDB, because removeOrphanedSegmentsInShakaOfflineDB must not create the database and tables as a side effect if they don't already exist (see the comment in the code below). It would create stores that use out-of-line keys, whereas the Shaka Player storage API uses in-line (generated) keys. That would lead to errors later on, when Shaka Player populates the stores with data without providing keys.

This call to storage.list() will properly create the shaka_offline_db database and the manifest-v5 and segment-v5 tables in this database if they don't already exist.
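The ordering constraint can be made explicit with a small wrapper. This is a sketch, not the code from my app: cleanUpAfterStorageInit and OfflineStorageLike are hypothetical names, and the storage object is passed in so that the wrapper does not depend on how the Shaka storage instance is created.

```typescript
// Minimal shape of the storage object needed here; a shaka.offline.Storage
// instance is assumed to satisfy it through its list() method.
interface OfflineStorageLike {
  list(): Promise<unknown[]>;
}

/**
 * Runs the cleanup only after storage.list() has completed, so that the
 * database and its stores are created by Shaka Player itself and not as a
 * side effect of the cleanup.
 */
async function cleanUpAfterStorageInit(
  storage: OfflineStorageLike,
  removeOrphans: () => Promise<void>
): Promise<void> {
  await storage.list();
  await removeOrphans();
}
```

At startup, this would be called with the app's storage instance and removeOrphanedSegmentsInShakaOfflineDB as the second argument.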

import { createStore, values, keys, delMany } from 'idb-keyval';
import { z } from 'zod';

import { logDebug, logError, logInfo, logWarning } from '@utils/logger';
import { assert } from '@utils/assertions';
import { isDefined } from '@utils/utils';

/**
 * Schemas are based on typedef in shaka-player/externs/shaka/offline.js
 */

const segmentDBSchema = z
    .object({
        /**
         * The storage key where the init segment is found; null if no init segment
         */
        initSegmentKey: z.number().optional(),

        /**
         * The key to the data in storage
         */
        dataKey: z.number()
    })
    .readonly();

const streamDBSchema = z
    .object({
        /**
         * An array of segments that make up the stream.
         */
        segments: z.array(segmentDBSchema).readonly()
    })
    .readonly();

const manifestDBSchema = z
    .object({
        /**
         * If true, the content is still downloading.
         */
        isIncomplete: z.boolean().optional(),

        /**
         * The Streams that are stored.
         */
        streams: z.array(streamDBSchema).readonly()
    })
    .readonly();

type ManifestDB = z.infer<typeof manifestDBSchema>;

function isManifestDB(value: unknown): value is ManifestDB {
    return manifestDBSchema.safeParse(value).success;
}

/**
 * WARNING: This must obviously match the names of the IndexedDB database and tables used by Shaka Player !
 * This is a hack until shaka player database is properly cleaned up (no orphaned data, no old data, etc) by
 * the library itself !
 */
const shakaOfflineDBName = 'shaka_offline_db';
const shakaOfflineManifestsTableName = 'manifest-v5';
const shakaOfflineSegmentsTableName = 'segment-v5';

/**
 * Cleans the Shaka offline database by removing orphaned segments
 * (segments that are not referred by any manifest).
 * This may need to be done periodically if foreground storage operations
 * have been interrupted by closing the page, for example.
 * Also aborted downloads may leave orphaned segments.
 */
export async function removeOrphanedSegmentsInShakaOfflineDB(): Promise<void> {
    // Warning: This would create the database and stores if they don't already exist !!
    const manifestsStore = createStore(shakaOfflineDBName, shakaOfflineManifestsTableName);
    const segmentsStore = createStore(shakaOfflineDBName, shakaOfflineSegmentsTableName);

    logInfo(`Cleaning orphaned segments in ${shakaOfflineDBName} database...`);

    let allSegmentsKeys: number[] = [];
    try {
        allSegmentsKeys = await keys<number>(segmentsStore);
        logDebug(`Found ${allSegmentsKeys.length} segments in ${shakaOfflineSegmentsTableName} table of ${shakaOfflineDBName} database`);
    } catch (error: unknown) {
        logWarning(
            `Failed to get all segments from ${shakaOfflineSegmentsTableName} table of ${shakaOfflineDBName} database. Error is ${
                error instanceof Error ? error.message : 'unknown'
            }. Maybe the table does not exist yet ?`,
            error
        );
        return;
    }

    const valuesInManifestsStore: unknown[] = await values(manifestsStore);
    const manifests: ManifestDB[] = valuesInManifestsStore.filter(isManifestDB);
    assert(manifests.length === valuesInManifestsStore.length, 'Invalid data found in the Shaka offline database ?');
    const allReferredSegmentsKeys = new Set<number>(
        manifests
            .flatMap(({ streams }) => streams)
            .flatMap(({ segments }) => segments)
            .flatMap(({ initSegmentKey, dataKey }) => [initSegmentKey, dataKey].filter(isDefined))
    );
    logDebug(
        `Found ${allReferredSegmentsKeys.size} referred segments in ${shakaOfflineManifestsTableName} table of ${shakaOfflineDBName} database`
    );

    // Note: Since Set.prototype.difference() implementation is available only in Chrome >= 122
    // (see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Set/difference),
    // and we are using Chrome 120 at the time of writing this code, we are manually computing the difference :
    const orphanedSegmentsKeys = allSegmentsKeys.filter((x) => !allReferredSegmentsKeys.has(x));
    if (orphanedSegmentsKeys.length === 0) {
        logInfo(`No orphaned segments found in ${shakaOfflineSegmentsTableName} table of ${shakaOfflineDBName} database`);
        return;
    }

    logInfo(
        `Found ${orphanedSegmentsKeys.length} orphaned segments in ${shakaOfflineSegmentsTableName} table of ${shakaOfflineDBName} database`
    );
    try {
        await delMany(orphanedSegmentsKeys, segmentsStore);
    } catch (error: unknown) {
        logError(
            `Failed to delete ${
                orphanedSegmentsKeys.length
            } orphaned segments from ${shakaOfflineSegmentsTableName} table of ${shakaOfflineDBName} database. Error is ${
                error instanceof Error ? error.message : 'unknown'
            }`,
            error
        );
        return;
    }

    logInfo(
        `Successfully deleted ${orphanedSegmentsKeys.length} orphaned segments from ${shakaOfflineSegmentsTableName} table of ${shakaOfflineDBName} database`
    );
}
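
The key computation in the middle of that function can also be isolated as a pure function, which makes it testable without IndexedDB. This is a sketch of the same set difference, with hypothetical names (findOrphanedSegmentKeys and the three Ref interfaces). Unlike the schema above, it also tolerates a null initSegmentKey.

```typescript
// Minimal shapes mirroring the fields read from the stored manifests.
interface SegmentRef {
  initSegmentKey?: number | null;
  dataKey: number;
}
interface StreamRef {
  segments: readonly SegmentRef[];
}
interface ManifestRef {
  streams: readonly StreamRef[];
}

/** Returns the keys in `allKeys` that no manifest refers to. */
function findOrphanedSegmentKeys(
  allKeys: readonly number[],
  manifests: readonly ManifestRef[]
): number[] {
  const referred = new Set<number>();
  for (const manifest of manifests) {
    for (const stream of manifest.streams) {
      for (const segment of stream.segments) {
        referred.add(segment.dataKey);
        if (typeof segment.initSegmentKey === 'number') {
          referred.add(segment.initSegmentKey);
        }
      }
    }
  }
  return allKeys.filter((key) => !referred.has(key));
}
```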
