[WIP] Third attempt for a preload API #1407

Open
wants to merge 13 commits into
base: misc/imediaelement

Conversation

peaBerberian
Collaborator

@peaBerberian peaBerberian commented Mar 15, 2024

STATUS: This is heavily in development, and is only set up as a Pull Request to offer more visibility and ease exchanges between RxPlayer maintainers. The public API part of this feature is not yet decided, as we're still trying many different kinds of technical implementations first.

EDIT (2024-03-18): I finally chose to implement the buffering on the SegmentSink-side, as it involves fewer modules than other attempts and is more easily translatable to the multithread mode.


We recently worked a little on what a preload API and its implementation could look like.
The requirements for that preload API are that it must not rely on a video element nor on the MSE and EME APIs, and that it should be possible to preload a content while another is already playing.

After multiple attempts that turned out to be too ambitious for now, I'm now looking at whether we could just implement a preload API by plugging ourselves into our regular loading logic and making a few changes:

  • When starting a new content, we create a ContentInitializer linked to that content. This is a module starting everything that needs to be started to play a content (fetching the content's Manifest, creating the MediaSource, and the right modules to begin playback).

    Here I'm trying to allow it to start even without a media element nor a MediaSourceInterface instance (our abstraction over the browser's MSE API). In that case, it would actually preload media segments in-memory, by creating new specially-purposed SegmentSinks, which are the modules on which media segments are pushed.
    Moreover the "media observations" sent by a PlaybackObserver would always correspond to the initially-wanted position when no media element is involved.

    This is how the "preloading" part would be set up.

  • When finally really loading the content, we would give the real video element to the ContentInitializer, and it would begin to set up the modules and logic relying on it (listening to media errors, performing the initial seek and autoplay etc.) as well as creating a MediaSource.

    It would also trigger a "reloading" step, which restarts most content-linked submodules, while also setting a softReload boolean which indicates that the media data should be preserved if possible.
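
To make the two steps above more concrete, here is a minimal sketch of a sink that keeps segments in memory during preloading and hands them over once a real sink exists. Every name in it (`InMemorySegmentSink`, `SinkLike`, `flushTo`...) is hypothetical and only there for illustration; it is not the RxPlayer's actual SegmentSink API, and the real "soft reload" logic is most probably more involved.

```typescript
// Hypothetical names throughout: this is NOT the RxPlayer's SegmentSink API.
interface SegmentInfo {
  representationId: string;
  start: number; // in seconds
  end: number; // in seconds
}

interface BufferedSegment {
  info: SegmentInfo;
  data: Uint8Array;
}

/** Minimal contract of something media segments can be pushed to. */
interface SinkLike {
  push(info: SegmentInfo, data: Uint8Array): void;
}

/** Keeps pushed segments in memory until a real sink is available. */
class InMemorySegmentSink implements SinkLike {
  private _segments: BufferedSegment[] = [];

  push(info: SegmentInfo, data: Uint8Array): void {
    this._segments.push({ info, data });
  }

  /** Total size of the data currently held, in bytes. */
  getByteSize(): number {
    return this._segments.reduce((acc, s) => acc + s.data.byteLength, 0);
  }

  /**
   * Push everything held so far to `target`, in insertion order, then empty
   * the local store. Returns the number of segments transferred.
   */
  flushTo(target: SinkLike): number {
    const toTransfer = this._segments;
    this._segments = [];
    for (const segment of toTransfer) {
      target.push(segment.info, segment.data);
    }
    return toTransfer.length;
  }
}

// Preloading step: no media element yet, segments stay in memory
const preloadSink = new InMemorySegmentSink();
preloadSink.push({ representationId: "video-1", start: 0, end: 2 }, new Uint8Array(4));
preloadSink.push({ representationId: "video-1", start: 2, end: 4 }, new Uint8Array(6));
const bytesBeforeFlush = preloadSink.getByteSize();

// Loading step: a "real" sink now exists, preserved data is transferred to it
const received: SegmentInfo[] = [];
const realSink: SinkLike = {
  push: (info) => {
    received.push(info);
  },
};
const transferred = preloadSink.flushTo(realSink);
```

Exposing the held byte size is one possible starting point for the memory-limiting strategy mentioned below, as a caller could stop preloading once a threshold is reached.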

There are still many complexities here (though it is much simpler than other previous solutions):

  • I have no idea yet how we will handle the EME API (for DRM) here. I would guess it would be very interesting to be able to preload licenses, but there is a lot of complexity that can show up due to the fact that we may also be currently playing another content on the media element. For now, I'm only working on clear contents.

  • We need to have a strategy to avoid keeping potentially too much data in memory.

  • For now I'm only working on monothread scenarios, but a good solution should also work in multithread scenarios.

@peaBerberian peaBerberian force-pushed the feat/preload3 branch 14 times, most recently from fca04f7 to a715237 Compare March 18, 2024 14:27
@peaBerberian peaBerberian changed the title [WIP] Third attempt for a reload API [WIP] Third attempt for a preload API Mar 20, 2024
We changed the name of classes from *SegmentBuffer to *SegmentSink but
didn't actually update the file names. This is now done.

To avoid the confusion between several concepts named "settings" in the
`MediaSourceContentInitializer`, the settings provided to its
constructor are now renamed to `_initSettings` (aligns with
`_initCanceller` for its TaskCanceller initialized at instantiation).
Some private methods of the `MediaSourceContentInitializer` were
difficult to follow due to the heavy usage of inner functions with
unclear names (such as `recursivelyLoadOnMediaSource`).

I now tried to isolate some of those and rename the different methods
called at content load so it makes more sense. Methods in order are:

```
start > _setupInitialMediaSourceAndDecryption >
_onInitialMediaSourceReady > _setupContentWithNewMediaSource >
_startLoadingContentOnMediaSource
```

Note: `_setupContentWithNewMediaSource` and
`_startLoadingContentOnMediaSource` are still similarly named, though
the former just has the task of setting up a reloading logic and then
calling the latter.

Note2: `_startLoadingContentOnMediaSource` is still a huge and complex
method that we might improve on in the future.

This is a minor code update to make the NativeTextDisplayer's clean-up
easier to follow by isolating the DOM-related (TextTrack,
HTMLTrackElement, HTMLMediaElement) code inside its own private method.
…ctions

The callback called on the `"updateend"` and `"error"` events on a
SourceBuffer were previously declared as arrow functions inside the
constructor.

This allowed keeping a clear JS context for the `this` keyword, but it
made the `MainSourceBufferInterface`'s constructor harder to read.

I found it more readable to declare both callbacks as private methods
instead, with the drawback of having to bind the `this` explicitly.
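
As an illustration of that trade-off, here is a minimal sketch of the "private method plus explicit bind" pattern. `FakeEventSource` and `BufferInterfaceSketch` are hypothetical stand-ins written for this example (a real `SourceBuffer` is only available in a browser); this is not the actual `MainSourceBufferInterface` code.

```typescript
type Listener = () => void;

/** Tiny hypothetical stand-in for an event target such as a SourceBuffer. */
class FakeEventSource {
  private _listeners = new Map<string, Listener[]>();

  addEventListener(name: string, cb: Listener): void {
    const list = this._listeners.get(name) ?? [];
    list.push(cb);
    this._listeners.set(name, list);
  }

  removeEventListener(name: string, cb: Listener): void {
    const list = this._listeners.get(name) ?? [];
    this._listeners.set(
      name,
      list.filter((l) => l !== cb),
    );
  }

  emit(name: string): void {
    for (const cb of this._listeners.get(name) ?? []) {
      cb();
    }
  }
}

class BufferInterfaceSketch {
  public updateEndCount = 0;
  private _source: FakeEventSource;

  constructor(source: FakeEventSource) {
    this._source = source;
    // Bind once and keep the bound reference, so that the exact same
    // function can be given to `removeEventListener` later.
    this._onUpdateEnd = this._onUpdateEnd.bind(this);
    source.addEventListener("updateend", this._onUpdateEnd);
  }

  dispose(): void {
    this._source.removeEventListener("updateend", this._onUpdateEnd);
  }

  // Declared as a regular method: the constructor stays short, but `this`
  // would be lost without the explicit `bind` above.
  private _onUpdateEnd(): void {
    this.updateEndCount += 1;
  }
}

const source = new FakeEventSource();
const sketch = new BufferInterfaceSketch(source);
source.emit("updateend");
source.emit("updateend");
sketch.dispose();
source.emit("updateend"); // not counted anymore: the listener was removed
```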
Our `SyncOrAsync` util is used for cases where a task is >99% of the time
synchronous, yet <1% of the time asynchronous, and where we don't want to
incur the overhead/supplementary complexity of awaiting a Promise, which
inherently schedules a microtask, when the value is most probably already
there.

It worked well for most usages, but it turns out that a task that starts
as asynchronous would then always lead to the need to rely on Promises,
even once the task is finished (basically, if it started as an "async
value" it could never transform itself to a "sync value").

This does not create any issue but we could gain some minuscule advantage
(well, we could argue that `SyncOrAsync`'s advantage is in itself
minuscule) here by just relying on the value synchronously once the task is
finished.
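
A minimal sketch of the idea, assuming a value object exposing both a synchronous and an asynchronous access path. The names used here (`SyncOrAsyncValue`, `createSync`, `createAsync`) are chosen for this example and may not match the actual util exactly.

```typescript
interface SyncOrAsyncValue<T> {
  /** The value if it is already known, `null` otherwise. */
  syncValue: T | null;
  /** Always usable, at the cost of going through a Promise. */
  getValueAsAsync(): Promise<T>;
}

/** The value is known right away. */
function createSync<T>(value: T): SyncOrAsyncValue<T> {
  return {
    syncValue: value,
    getValueAsAsync: () => Promise.resolve(value),
  };
}

/** The value is only known once `promise` resolves. */
function createAsync<T>(promise: Promise<T>): SyncOrAsyncValue<T> {
  const ret: SyncOrAsyncValue<T> = {
    syncValue: null,
    getValueAsAsync: () => promise,
  };
  promise.then(
    (value) => {
      // The change discussed here: once the task is finished, the result
      // also becomes available synchronously.
      ret.syncValue = value;
    },
    () => {
      // Rejections are left to the callers of `getValueAsAsync`.
    },
  );
  return ret;
}

const immediate = createSync(5);
const pending = createAsync(Promise.resolve(7));
const initialPendingValue = pending.syncValue; // still `null` at this point
```

A caller would typically check `syncValue` first and only fall back to `getValueAsAsync()` when it is `null`.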
This is a proof-of-concept to see if adding a mock implementation of
the video HTML5 API and MSE API subset used by the RxPlayer inside the
code of the RxPlayer seems useful and maintainable.

Motivation
----------

Initially, the idea was to implement one of the several solutions we
have in mind to preload a future content. This was the most far-fetched
of them, but I thought that it still made enough sense to be
tried.
In this solution, the application would create two player instances:

  1. One linked to the true media element on the page, as usual

  2. The other linked to our dummy media element which would preload
     and store locally (in memory? through storage APIs when available?)
     loaded contents.

     Note that here nothing will change in the RxPlayer API, it is just
     that the application will have provided to that instance our mocked
     media element - which would implement all that storage logic in its
     implementation of MSE API - instead of a regular one. This is to
     ensure a very minimal modification of the core RxPlayer code.

When the application judges that playback should begin, it can get the
preloaded data through an API of that dummy media element, call `stop`
on the RxPlayer instance with that dummy element and give the preloaded
data as a supplementary `loadVideo` option (`preloadedData`?) to the
RxPlayer instance with the real media element.

There are several complexities that are not yet handled here: most of
all we need to be careful as segments on that dummy implementation will
for now be stored in memory. We will also need to provide
segment-identification metadata alongside the preloaded data so the
"real" RxPlayer is able to recognize which Adaptation/Representation has
already been loaded, so that this instance doesn't try to replace it.

Also, a potential use case asked for by applications is to preload a content
while another one is already loading. With how this solution is
currently implemented, this wouldn't be efficient, as both instances could
be loading media data at the same time without prioritization mechanisms
(e.g. we would imagine that the currently-loaded content is more
important) - though I imagine this could be implemented in some way.

Other uses
----------

While writing a skeleton of it, I realized that the MSE API outside of
segment parsing and decoding is relatively straightforward: throw when
the state or arguments are not right, send the right events etc., so I
thought that a second use case (which may well end up being our first use
case!) would be testing.

Thanks to this implementation:

  1. we could just push fake generated content to facilitate writing
     tests (no need to generate a real content linked to the wanted
     behavior for each test)

  2. we could more easily replicate and test MSE implementation bugs
     seen on other devices and ensure they keep being tested.

Another nice use of it is that it implements a good chunk of the APIs used
by the RxPlayer that are only found in browser environments, which may
simplify PoCs in more restricted environments which can still run JS.
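
To give an idea of how small the "throw when the state is not right, send the right events" part can be, here is a hypothetical, heavily simplified dummy `SourceBuffer`. It is only a sketch written for this description, not the proof-of-concept's actual code: it just mirrors the standard rule that `appendBuffer` throws while `updating` is `true`, and uses a plain `Error` where a browser would throw an `InvalidStateError` `DOMException`.

```typescript
/** Hypothetical, heavily simplified dummy SourceBuffer (no parsing, no decoding). */
class DummySourceBuffer {
  public updating = false;
  public readonly pushed: Uint8Array[] = [];
  public onupdateend: (() => void) | null = null;

  appendBuffer(data: Uint8Array): void {
    if (this.updating) {
      // A browser would throw an `InvalidStateError` DOMException here
      throw new Error("InvalidStateError: the SourceBuffer is still updating");
    }
    this.updating = true;
    this.pushed.push(data);
    // The real operation finishes asynchronously; a microtask is enough
    // to replicate that ordering in a test.
    Promise.resolve().then(() => {
      this.updating = false;
      this.onupdateend?.();
    });
  }
}

const dummyBuffer = new DummySourceBuffer();
dummyBuffer.appendBuffer(new Uint8Array(2)); // fake generated "segment"
const updatingAfterFirstPush = dummyBuffer.updating;

let secondPushThrew = false;
try {
  // Pushing again before "updateend" is a state error, as with real MSE
  dummyBuffer.appendBuffer(new Uint8Array(2));
} catch (e) {
  secondPushThrew = true;
}
```

With such an object, a test can push arbitrary bytes and assert on `pushed`, without having to generate real media content.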
In a recent PoC (Proof Of Concept), I attempted to replicate a subset of
the HTMLMediaElement and MSE APIs without the decoding part to facilitate
the implementation of some advanced features and integration tests.

It has shown potential, and though it may be a little too soon to merge
and rely on that development, I propose here to merge a component of it
that can be useful in multiple ways.

The idea is to restrict some key browser API (`HTMLMediaElement`,
`MediaSource` and `SourceBuffer`) types by providing our own
browser-compatible (this compatibility is checked at compile-time through
some TypeScript trick) type definitions but with either optional
supplementary methods and attributes or compatible updated definitions
(e.g. a method whose return type was only an enumeration of some values
could now return even more values).

For example, for `IMediaElement` (the redefinition of
`HTMLMediaElement`), I added the vendor-prefixed events and methods that
may be used by the RxPlayer (that were previously in the
`ICompatHTMLMediaElement` type) - such as the `webkitSetMediaKeys`
method.
This makes TypeScript nicer to us when we're exploiting
webkit/moz/ms-prefixed APIs, for example.
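
One possible way to get such a compile-time check is sketched below. This is an assumption about how it could be done rather than the exact trick used in this PR, and `BrowserMediaElement` is a hypothetical stand-in for the real `HTMLMediaElement` type so that the example does not depend on DOM typings.

```typescript
/** Hypothetical stand-in for the browser's own definition (`HTMLMediaElement`). */
interface BrowserMediaElement {
  currentTime: number;
  play(): Promise<void>;
}

/** Our own definition: same members, plus optional vendor-prefixed ones. */
interface MediaElementLike {
  currentTime: number;
  play(): Promise<void>;
  webkitSetMediaKeys?: (mediaKeys: unknown) => void;
}

/** Only compiles if `Narrow` is assignable to `Wide`. */
type AssertAssignable<Wide, Narrow extends Wide> = Narrow extends Wide ? true : never;

// Compilation fails here if the browser type stops being usable as ours
const browserTypeIsCompatible: AssertAssignable<MediaElementLike, BrowserMediaElement> = true;

/** Uses the prefixed method when present, without any type cast. */
function setMediaKeys(elt: MediaElementLike, mediaKeys: unknown): boolean {
  if (typeof elt.webkitSetMediaKeys === "function") {
    elt.webkitSetMediaKeys(mediaKeys);
    return true;
  }
  return false;
}

const plainElement: BrowserMediaElement = {
  currentTime: 0,
  play: () => Promise.resolve(),
};
const receivedKeys: unknown[] = [];
const prefixedElement: MediaElementLike = {
  currentTime: 0,
  play: () => Promise.resolve(),
  webkitSetMediaKeys: (keys) => {
    receivedKeys.push(keys);
  },
};

// A "browser" element is accepted as-is where our own type is expected
const usedOnPlain = setMediaKeys(plainElement, "keys");
const usedOnPrefixed = setMediaKeys(prefixedElement, "keys");
```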

I also added to it the optional `FORCED_MEDIA_SOURCE` attribute allowing
to define a custom MSE implementation when relying on a given media
element, though we could also remove that part for this PR.

Another key advantage is that the subset of MSE-related APIs that are
relied on by the RxPlayer is now clearly listed in a single file, which
can be useful when debugging, making API-interaction changes and/or when
refactoring.

This was done as it makes more sense to have an "idle" state after
stopping than after disposing (in the latter case, we would have to add
something like a "disposed" state) and as it may be more portable to
allow re-using a ContentInitializer later, though there is no real use
case for that yet.