EXT-X-DATERANGE metadata synchronisation vs video stream in presence of frequent EXT-X-DISCONTINUITYs #6203

Closed
5 tasks done
lnstadrum opened this issue Feb 9, 2024 · 6 comments · May be fixed by #6213

Comments

@lnstadrum

lnstadrum commented Feb 9, 2024

What version of Hls.js are you using?

1.5.4

What browser (including version) are you using?

Chromium 120.0.6099.71 (Official Build)

What OS (including version) are you using?

Linux Mint 21.2 Victoria (64-bit)

Test stream

https://hlsjs.video-dev.org/demo/?src=https%3A%2F%2Fbonksound.studio%2Fhls%2Fplaylist.m3u8&demoConfig=eyJlbmFibGVTdHJlYW1pbmciOnRydWUsImF1dG9SZWNvdmVyRXJyb3IiOnRydWUsInN0b3BPblN0YWxsIjpmYWxzZSwiZHVtcGZNUDQiOmZhbHNlLCJsZXZlbENhcHBpbmciOi0xLCJsaW1pdE1ldHJpY3MiOi0xfQ==

Configuration

{
  debug: false,
  maxBufferLength: 300
}

Additional player setup steps

This is not a playback issue: please do not expect any errors on the demo stream page.

The issue is related to the metadata track vs video track synchronization. To observe it we need (1) a playlist constructed in a special way (details follow and the test sample is provided), and (2) a little tooling, consisting in setting up a cuechange listener as follows:

// vanilla HLS instance setup
const video = document.getElementById("video")
const hls = new Hls()

hls.attachMedia(video)

hls.on(Hls.Events.MEDIA_ATTACHED, () => {

    // We are going to listen to cue changes and print something to the browser console.
    video.textTracks.addEventListener("addtrack", (event) => {

        // Use the track that was just added rather than assuming it is textTracks[0].
        const track = event.track

        track.addEventListener("cuechange", () => {
            // cuechange also fires when a cue ends, so there may be no active cue.
            const cue = track.activeCues && track.activeCues[0]
            if (!cue) return

            // Grab the cue ID, which is (arbitrarily) built from
            // a useful part and a random suffix with a dot in-between.
            // The random suffix is only needed to ensure the uniqueness
            // of the IDs as required by the HLS specification.
            const id = cue.id

            // Display the 'useful part' of the ID in the browser console:
            // the test stream is constructed in a way that the displayed text
            // should match the number shown in the video frame.
            console.log(id.split(".")[0])
        })

    })

    hls.loadSource("playlist.m3u8")

})

Additional details

(Apologies for the verbosity of what follows.)

The test stream consists of several pieces of content, each made of a few MPEG-TS fragments. There is a timestamp discontinuity between consecutive pieces, so an EXT-X-DISCONTINUITY tag is inserted in the playlist at each boundary.

For test purposes,

  • We have 7 video pieces looped in the playlist.
  • Every video piece only displays its own number.
  • There is a silent audio track.

Despite its synthetic appearance, this test stream actually comes from real data we work on. We replaced its content but we kept timestamps and stream durations unchanged.

In our application, we need to be able to identify which piece is being played when the user interacts with other elements in the page. As discussed here, there are several possibilities to solve that.

  • The simplest one would be to take the player's currentTime and map it to a position in the playlist. Such an estimate diverges quickly, likely because of a variable delay added in the playback time domain every time a discontinuity is encountered.
  • As suggested by @robwalch and other discussions on GitHub, we may put an EXT-X-PROGRAM-DATE-TIME tag after every discontinuity and then use the playingDate property of the corresponding HLS instance to grab the "absolute" playback time instead of using the drifting currentTime. Similarly, we may listen to FRAG_CHANGED events and directly get the name of the fragment being played. While this works well, we do not see a way to get it working with the native iOS HLS implementation, since the playingDate and FRAG_CHANGED APIs are specific to HLS.js.
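
The PDT-based lookup from the second bullet can be sketched roughly as follows. This is only an illustration: `findPieceAt` and the `pieces` array (one `{ id, start }` entry per piece, sorted by start date) are hypothetical names, and the sketch assumes the application already knows the EXT-X-PROGRAM-DATE-TIME value it wrote after each discontinuity.

```javascript
// Hypothetical helper: return the piece whose start date is the latest one
// that is not after `date`, or null if `date` precedes every piece.
// `pieces` must be sorted by ascending `start` (a Date).
function findPieceAt(pieces, date) {
    const t = date.getTime()
    let lo = 0
    let hi = pieces.length - 1
    let found = null
    // Binary search over the sorted start dates.
    while (lo <= hi) {
        const mid = (lo + hi) >> 1
        if (pieces[mid].start.getTime() <= t) {
            found = pieces[mid]
            lo = mid + 1
        } else {
            hi = mid - 1
        }
    }
    return found
}
```

With HLS.js this could be called as `findPieceAt(pieces, hls.playingDate)` once `hls.playingDate` is non-null. Binary search is used here only because a 12h playlist may contain many pieces; a linear scan would work just as well for short ones.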

For this reason we try to push this approach a bit further. After every discontinuity and its associated EXT-X-PROGRAM-DATE-TIME we put an EXT-X-DATERANGE tag, so that a metadata record appears in a textTrack. The date-range record has a start time matching the PDT, a duration roughly matching the length of the particular piece, and an ID attribute carrying the piece ID. So our playlist is built of repeated sections as follows:

#EXT-X-DISCONTINUITY
#EXT-X-PROGRAM-DATE-TIME:2024-02-05T00:00:04.004000
#EXT-X-DATERANGE:ID=video_piece_id,START-DATE=2024-02-05T00:00:04.004000,DURATION=6.006,X-CUE=" "
#EXTINF:...
...
#EXTINF:...
...

We can then use the common browser API to listen to cuechange events on that textTrack to identify which video piece is being played. Since this is not specific to HLS.js, in theory we can expect it to work with the native HLS implementation in iOS as well. As long as EXT-X-PROGRAM-DATE-TIME is correctly inferred and the EXT-X-DATERANGE tags have the same start time as the PDT tags, we should be able to keep the textTrack in sync with the video despite all the discontinuities.

The piece of JavaScript above listens to these events and displays the video piece ID in the browser console. We can then check whether the obtained ID matches the actual content being played.

So the issue is that those cuechange events actually come out of sync.

A few final notes

  • We did check that EXT-X-PROGRAM-DATE-TIME alone is enough to identify the video piece being played using the playingDate API without any drift, for a 12h long playlist.
  • I have no way to check how a reference (native iOS) implementation behaves in this test. Maybe this text-video synchronization with frequent discontinuities is too much to ask. I would appreciate it if someone could let me know. It would be a pity though, because HLS.js is aware of the exact absolute time and should be able to provide perfectly synchronized metadata.

Checklist

Steps to reproduce

  1. Attach a cuechange event listener as described in the additional player setup steps.
  2. Open the JavaScript browser console.
  3. Start playback. Please be patient and do not seek through the video.
  4. Watch for the messages printed in the console: an integer number (the cue ID) from 1 to 7 is displayed every few seconds.
  5. When the number changes, look at what is displayed on the screen at that very moment (also a number from 1 to 7).

An instrumented demo is available here.

Expected behaviour

The number printed in the browser console matches the number in the video frame, i.e., the text metadata track is in sync with the video track.

What actually happened?

All good at the beginning, but after ~1 min there is a noticeable delay between the number in the console and the number in the video frame, of about a second (the latter is delayed with respect to the former). It does not keep increasing indefinitely though, and seems to be related to maxBufferLength, i.e., increasing the latter makes the delay worse.

Console output

1
2
3
4
5
6
7
1
2
...

Chrome media internals output

No response

lnstadrum added Bug Needs Triage labels Feb 9, 2024
robwalch added Confirmed Stream Issue and removed Needs Triage labels Feb 9, 2024
@robwalch
Collaborator

robwalch commented Feb 9, 2024

In HLS.js, DATERANGE tags are mapped to cues on the TextTrack timeline using playlist time (EXTINF durations). The drift occurs when media is parsed and the parsed media duration differs from the duration in the playlist, and the differences do not cancel out when summed over the program. This is the case in your sample. The demo page "Timeline" tab shows a parsed segment duration of slightly over 1.02s each vs the #EXTINF:1.001000 found in the playlist. This is not a result of DISCONTINUITY tags, but of the playlist segment durations each being less than the corresponding parsed segment durations. Generally, HLS.js determines parsed segment duration as the difference between the starting video timestamp of a segment and the starting timestamp of the next.
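
To give a feel for the numbers, here is a small worked sketch of how such a per-segment difference adds up. The helper name `accumulatedDrift` is hypothetical, and the parsed duration of exactly 1.02 s is an assumption taken from the "slightly over 1.02s" figure above; real values vary per segment.

```javascript
// Hypothetical helper: sum of (parsed - playlist) durations, in seconds,
// over the first `count` segments.
function accumulatedDrift(playlistDurations, parsedDurations, count) {
    let drift = 0
    for (let i = 0; i < count; i++) {
        drift += parsedDurations[i] - playlistDurations[i]
    }
    return drift
}

// Illustrative values: EXTINF of 1.001 s vs. an assumed parsed duration of 1.02 s.
const playlistDurations = new Array(60).fill(1.001)
const parsedDurations = new Array(60).fill(1.02)
console.log(accumulatedDrift(playlistDurations, parsedDurations, 60).toFixed(2)) // "1.14"
```

Under these assumptions, about 60 segments (roughly one minute) accumulate a bit over a second of offset between playlist time and parsed media time, which is the same order of magnitude as the delay described in the report.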

You can access DateRange data directly in HLS.js using hls.levels[hls.currentLevel].details?.dateRanges. This is a map (Object) of all parsed DateRanges by ID. It's a much more complete and up-to-date collection of logical DateRanges - valid tags with the same ID are merged, and all attributes are available on the object. In v1.6 (with #6213) DateRanges will have a "tag anchor" that reference their adjacent fragment in the playlist so that their start time is always mapped to the PDT and discontinuity domain at that segment position on the playback timeline. The LEVEL_PTS_UPDATED event signals that segment times were updated based on parsed media timestamps and can be used to update app logic.
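
A minimal sketch of how that map might be queried is shown below. `activeDateRangeIds` is a hypothetical helper, and it assumes each entry exposes a `startDate` (a Date) and a `duration` in seconds that may be null; check the DateRange objects in your HLS.js version before relying on these property names.

```javascript
// Hypothetical helper: return the IDs of the DateRanges that are active at `date`.
// `dateRanges` is an object keyed by ID; each value is assumed to expose
// `startDate` (Date) and `duration` (seconds, possibly null).
function activeDateRangeIds(dateRanges, date) {
    const t = date.getTime()
    return Object.keys(dateRanges).filter((id) => {
        const range = dateRanges[id]
        const start = range.startDate.getTime()
        if (t < start) return false
        // Without a duration the range is treated as open-ended (an assumption).
        if (range.duration == null) return true
        return t < start + range.duration * 1000
    })
}

// Possible wiring in a page where `hls` is an Hls instance (not executed here):
//   const details = hls.levels[hls.currentLevel].details
//   if (details && hls.playingDate) {
//       console.log(activeDateRangeIds(details.dateRanges, hls.playingDate))
//   }
```

The lookup is kept as a pure function so it can be re-run whenever the application sees fit, for example from a LEVEL_PTS_UPDATED or FRAG_CHANGED handler.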

A fix for this metadata TextTrack specific issue still requires cue timing to be updated after media is parsed (on LEVEL_PTS_UPDATED) in the id3-track-controller. We can look into a fix for this in the next release. Even after the update I would recommend using the aforementioned LevelDetails dateRanges. Using cues is necessary in Safari HLS playback, but it produces a cue for every attribute, is missing the ID, and does not merge DateRange tags with the same ID.

@robwalch robwalch added this to the 1.6.0 milestone Feb 9, 2024
@robwalch
Collaborator

robwalch commented Feb 9, 2024

Marking as enhancement. This is not a regression.

DateRange TextTrack cues are mapped to playlist time, not to the video track or parsed media, as of v1.5.x. When the playlist times differ this much from the parsed media (on every segment, not just at discontinuities) I think it is fair to consider this a stream issue. To support Interstitials in v1.6, precise mapping of DateRanges to the playback timeline is crucial, so we can afford to compensate for these kinds of discrepancies, and explore sliding cue start times at a usable interval (if we cannot adjust them after creating them, we'll need to remove and re-add cues).

@robwalch
Collaborator

robwalch commented Feb 13, 2024

@lnstadrum,

Using cues is necessary in Safari HLS playback, but it produces a cue for every attribute, is missing the ID, and does not merge DateRange tags with the same ID.

FYI - The DateRanges in your playlist are invalid and while this should result in them being ignored, Apple HLS clients error and will not play the sample provided. ID and date attributes are expected to be provided as quoted-string values (...bonksound.studio/hls/playlist.m3u8 is missing quotes around ID and START-DATE values and errors rather than plays in Safari and Apple HLS clients).

Ex:

#EXT-X-DATERANGE:ID=4.16016,START-DATE=2024-02-05T00:00:16.016000,DURATION=4.004,X-CUE=" "

should be:

#EXT-X-DATERANGE:ID="4.16016",START-DATE="2024-02-05T00:00:16.016000",DURATION=4.004,X-CUE=" "

HLS.js is not strict about quoted-string attribute values that are missing quotes. In #6213 I've added some validation logic that logs a warning to the console when missing quotes are encountered.
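
For a playlist generator, the quoting rule can be captured in a small helper. The sketch below is one possible way to do it; `quoted` and `formatDateRange` are hypothetical names and not part of HLS.js, and the `X-CUE` attribute is simply copied from the sample playlist.

```javascript
// Hypothetical helper: wrap a value as an HLS quoted-string.
function quoted(value) {
    const s = String(value)
    // A quoted-string must not contain double quotes or line breaks.
    if (/["\r\n]/.test(s)) {
        throw new Error("invalid quoted-string value: " + s)
    }
    return '"' + s + '"'
}

// Hypothetical helper: build an EXT-X-DATERANGE line with ID and START-DATE
// emitted as quoted-strings and DURATION as an unquoted decimal number.
function formatDateRange({ id, startDate, duration }) {
    return "#EXT-X-DATERANGE:" + [
        "ID=" + quoted(id),
        "START-DATE=" + quoted(startDate),
        "DURATION=" + duration,
        "X-CUE=" + quoted(" "), // client-defined attribute taken from the sample
    ].join(",")
}
```

Calling `formatDateRange({ id: "4.16016", startDate: "2024-02-05T00:00:16.016000", duration: 4.004 })` yields the corrected line shown above.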

robwalch added a commit that referenced this issue Feb 14, 2024
Warn on invalid quoted-attribute attributes (missing quotes)
Related to #6203
@lnstadrum
Author

Hi @robwalch,

Thank you for looking into this.

  • In our application, we build a compilation of videos consisting of parts of existing HLS streams. A "part" is a consecutive subset of segments of another stream (which does not necessarily include the last segment of that stream). The original streams are produced by ffmpeg (and so are the EXTINF values). To build the compilation we simply copy the segments with their corresponding durations, without properly adjusting the duration of segments which fall right before an EXT-X-DISCONTINUITY in the compiled stream. So I guess it is indeed rather an issue with our stream. On the other hand, it would be very convenient for us to avoid querying and storing the "accurate" duration (not sure if this is the proper term) of every segment which may potentially be the last one in the corresponding part of the compilation. Doing so makes the process of building the compilation playlist and the logistics of storing the original streams in the DB more complex, and we do not know in advance which segments are concerned (the compilations are built on the fly).
  • Thank you for pointing out the quotes issue. I will fix this in the test stream. I haven't yet had the chance to test playback using the native iOS implementation, but will do so soon.

@robwalch
Collaborator

robwalch commented Feb 17, 2024

@lnstadrum,

I just added 0c5ca21 to #6213 which updates cue start and end times on PTS update. Let me know if this works for you:

https://feature-date-range-parsing.hls-js-4zn.pages.dev/demo/?src=https%3A%2F%2Fbonksound.studio%2Fhls%2Fplaylist.m3u8

Just confirmed that Safari does not shift cues to align with media this way, so you'll want to make sure your Playlist EXTINF durations align with presentation time in Apple HLS clients (or relative to Safari's HTMLMediaElement.getStartDate()). Providing unmuxed ISO BMFF segmented HLS Playlists may help as then your main playlist has only video segments and there is no start offset when segmented based on audio time or whatever track starts or ends first or last.
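
For the Safari side, a rough sketch of deriving a playback date from `getStartDate()` could look like this. `currentPlaybackDate` is a hypothetical helper, and it assumes that `getStartDate()` (a non-standard method, so it is feature-detected) returns the date corresponding to time zero on the media timeline.

```javascript
// Hypothetical helper: estimate the wall-clock date at the current playback
// position of a media element, or return null when it cannot be determined.
function currentPlaybackDate(video) {
    // getStartDate() is not available in every browser.
    if (typeof video.getStartDate !== "function") return null
    const start = video.getStartDate()
    // The start date may be missing or invalid when the stream carries no date information.
    if (!(start instanceof Date) || Number.isNaN(start.getTime())) return null
    return new Date(start.getTime() + video.currentTime * 1000)
}
```

The result is only as accurate as the playlist timing itself, so it does not replace aligning the EXTINF durations with the presentation time.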

I don't think it hurts for HLS.js to make these adjustments, as it may push the timeline out based on audio priming delays, overlaps not allowed in MSE, and other presentation time oddities the library has picked up on the way. This change will ensure we keep cues aligned with the library's interface for DateRange and Program-Date-Time mapping and hls.playingDate.

@lnstadrum
Author

Hi @robwalch,

Thank you, the test branch works as expected on our test stream.

We will indeed have to find a solution for Safari. There actually is a silent audio track in our test stream (as we do have audio in our streams in production), and audio and video tracks do not end at the same moment prior to discontinuity, or do not start at the same moment right after, due to the way our compiled stream is built.

I guess this issue can now be closed. Thanks a lot again for your help!
