Audio stream segments not of identical duration #1177

Open
aradalvand opened this issue Feb 20, 2023 · 4 comments

aradalvand commented Feb 20, 2023

System info

Operating System: Ubuntu 22.04
Shaka Packager Version: shaka-packager version v2.6.1-634af65-release

Issue and steps to reproduce the problem

I have an MP4 video that I re-encoded with FFmpeg's -force_key_frames option before feeding it to Shaka Packager:

ffmpeg -i input.mp4 -force_key_frames "expr:gte(t,n_forced*2)" output.mp4

This creates a keyframe every two seconds, so when the video is fed to Shaka Packager, it can create segments with perfectly consistent durations.
And it works: when I run Shaka Packager on this file with --segment_duration set to 2, here's the playlist that gets generated for the video stream:

#EXTM3U
#EXT-X-VERSION:6
## Generated with https://github.com/google/shaka-packager version v2.6.1-634af65-release
#EXT-X-TARGETDURATION:3
#EXT-X-PLAYLIST-TYPE:VOD
#EXT-X-MAP:URI="init.m4s"
#EXTINF:2.000,
seg-1.m4s
#EXTINF:2.000,
seg-2.m4s
#EXTINF:2.000,
seg-3.m4s
#EXTINF:2.000,
seg-4.m4s
#EXTINF:2.000,
seg-5.m4s
#EXTINF:2.000,
seg-6.m4s
#EXTINF:2.000,
seg-7.m4s
#EXTINF:2.000,
seg-8.m4s
#EXTINF:2.000,
seg-9.m4s
#EXTINF:2.000,
seg-10.m4s

## Etc.

However, the audio stream segments still don't have equal durations and their durations don't match those of their counterparts in the video stream. I can't figure out why:

#EXTM3U
#EXT-X-VERSION:6
## Generated with https://github.com/google/shaka-packager version v2.6.1-634af65-release
#EXT-X-TARGETDURATION:3
#EXT-X-PLAYLIST-TYPE:VOD
#EXT-X-MAP:URI="init.m4s"
#EXTINF:2.020,
seg-1.m4s
#EXTINF:1.997,
seg-2.m4s
#EXTINF:1.997,
seg-3.m4s
#EXTINF:1.997,
seg-4.m4s
#EXTINF:1.997,
seg-5.m4s
#EXTINF:1.997,
seg-6.m4s
#EXTINF:1.997,
seg-7.m4s
#EXTINF:2.020,
seg-8.m4s
#EXTINF:1.997,
seg-9.m4s
#EXTINF:1.997,
seg-10.m4s

## Etc.

What is the expected result?
The audio segments should be exactly "2.000" seconds in duration.

What happens instead?
The durations differ slightly.

nunocorreiavargas commented

Hi, the audio sampling rate and the audio frame size from the encoder result in an audio frame duration whose multiples don't add up to exactly the 2 s segment duration. Try using an audio sampling rate of 48 kHz and an audio frame size of 480 samples (or a multiple of it) in ffmpeg.
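
The arithmetic behind this explanation can be checked directly. The following is a minimal sketch, assuming the audio is AAC with 1024 samples per frame at 44.1 kHz. Neither value is stated in this thread; they are chosen because they reproduce the 1.997 and 2.020 durations in the audio playlist. `segment_frame_counts` is a hypothetical helper name.

```python
from fractions import Fraction

def segment_frame_counts(sample_rate, frame_size, segment_duration):
    """Return the whole-frame counts that bracket one segment, with durations in seconds."""
    # Exact (usually fractional) number of audio frames in one target segment.
    frames_exact = Fraction(segment_duration) * sample_rate / frame_size
    low = frames_exact.numerator // frames_exact.denominator
    high = low if frames_exact.denominator == 1 else low + 1
    frame_time = Fraction(frame_size, sample_rate)
    # A segment can only hold whole frames, so it is either `low` or `high` frames long.
    return [(n, float(n * frame_time)) for n in sorted({low, high})]

# Assumed values: 44.1 kHz, 1024 samples per frame, 2 s target segments.
for n, seconds in segment_frame_counts(44100, 1024, 2):
    print(f"{n} frames -> {seconds:.3f} s")
```

Under these assumptions a 2 s segment would need about 86.13 frames, so the packager can only alternate between 86-frame and 87-frame segments.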

aradalvand commented Mar 16, 2023

@nunocorreiavargas Hi, thanks. I tried the following FFmpeg command to extract the audio with the characteristics you mentioned and then fed it to Shaka Packager, but got the same result; the durations still aren't identical:

ffmpeg -i input.mp4 -vn -c:a aac -b:a 192k -ar 48000 -ac 1 -af "aformat=channel_layouts=mono,asetnsamples=n=480" audio.mp4

Unless I'm doing something wrong?
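
One possible reason the command changes nothing: as far as I know, AAC-LC encoders emit fixed 1024-sample frames, so `asetnsamples=n=480` would only affect frames inside the filter graph, not the encoded frame size. That is an assumption and is not confirmed anywhere in this thread. Under it, the sketch below (`frames_per_segment` is a hypothetical helper) compares how many frames fit into a 2 s segment at 48 kHz for the suggested 480-sample frames and for 1024-sample frames.

```python
from fractions import Fraction

def frames_per_segment(sample_rate, frame_size, segment_duration):
    """Exact (possibly fractional) number of audio frames in one segment."""
    return Fraction(segment_duration) * sample_rate / frame_size

# 480 is the suggested frame size; 1024 is the assumed AAC-LC frame size.
for frame_size in (480, 1024):
    frames = frames_per_segment(48000, frame_size, 2)
    fits = frames.denominator == 1  # True only if the segment holds a whole number of frames
    print(f"{frame_size}-sample frames: {float(frames)} per 2 s segment, whole number: {fits}")
```

With 480-sample frames the segment would hold a whole number of frames, which is presumably the point of the suggestion; with 1024-sample frames it would not, so the segment durations would still vary.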

nunocorreiavargas commented Mar 16, 2023 via email

misiek08 (Contributor) commented

@nunocorreiavargas I have a similar problem. DASH playlists look great for video streams, but the audio playlist lists most segments individually, because the duration changes almost every segment.

I ran ffprobe on the output, and for the fields I think are most important I get the same values.
In the manifest:

#EXTINF:6.012,
audio/seg_1631340681.m4s
#EXTINF:5.988,
audio/seg_1631881939.m4s

Then the ffprobe analysis:

# cat init.mp4 seg_1631340681.m4s | ffprobe -select_streams a:0 -show_frames - | grep durat | sort | uniq -c
    258 duration=2089
      1 duration=N/A
    258 duration_time=0.023211
      1 duration_time=N/A
    258 pkt_duration=2089
      1 pkt_duration=N/A
    258 pkt_duration_time=0.023211
      1 pkt_duration_time=N/A
# cat init.mp4 seg_1631881939.m4s | ffprobe -select_streams a:0 -show_frames - | grep durat | sort | uniq -c
    257 duration=2089
      1 duration=N/A
    257 duration_time=0.023211
      1 duration_time=N/A
    257 pkt_duration=2089
      1 pkt_duration=N/A
    257 pkt_duration_time=0.023211
      1 pkt_duration_time=N/A

So the first segment has one more packet than the second...
HLS looks like:

#EXTINF:6.012,
audio/seg_1638360306.m4s
#EXTINF:5.988,
audio/seg_1638901563.m4s
#EXTINF:6.012,
audio/seg_1639440731.m4s
#EXTINF:5.988,
audio/seg_1639981988.m4s
#EXTINF:5.988,
audio/seg_1640521155.m4s
#EXTINF:6.012,
audio/seg_1641060322.m4s
#EXTINF:5.988,
audio/seg_1641601579.m4s
#EXTINF:6.012,
audio/seg_1642140747.m4s
#EXTINF:5.988,
audio/seg_1642682004.m4s

and DASH (that is the worst part):

<SegmentTemplate timescale="90000" initialization="audio/init.mp4" media="audio/seg_$Time$.m4s" startNumber="206">
          <SegmentTimeline>
            <S t="1482300584" d="541051"/>
            <S t="1482841840" d="538962"/>
            <S t="1483381009" d="538962"/>
            <S t="1483920176" d="541051"/>
            <S t="1484461433" d="538962"/>
            <S t="1485000599" d="541051"/>
            <S t="1485541857" d="538962"/>
            <S t="1486081025" d="538962"/>
            <S t="1486620192" d="541051"/>

instead of a single, shiny <S /> tag with an r="..x.." attribute.
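
The two `d` values in the timeline line up with the ffprobe output. A minimal sketch, assuming the ffprobe `duration=2089` values are in the same 90000 timescale as the manifest and that the frame reported as `N/A` has the same duration as the others (both are assumptions; `frames_in_segment` is a hypothetical helper):

```python
TIMESCALE = 90000    # timescale attribute of the SegmentTemplate
FRAME_TICKS = 2089   # per-frame duration reported by ffprobe

def frames_in_segment(d):
    """Split a segment duration in ticks into whole frames and leftover ticks."""
    return divmod(d, FRAME_TICKS)

# The two distinct d values that appear in the SegmentTimeline.
for d in (541051, 538962):
    frames, remainder = frames_in_segment(d)
    print(f"d={d}: {frames} frames, remainder {remainder}, {d / TIMESCALE:.3f} s")
```

Both values divide evenly by 2089, giving 259 and 258 frames, which would explain why consecutive entries cannot be merged into one repeated `<S>` element: the repeat attribute only covers runs of identical `d`.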

My ffmpeg command only sets the codec and audio bitrate; no channels, sampling rate, or other filters. packager has no special flags, just simple path templates and the init segment definition. I'm happy to test different options if you are still interested ;)
