
Improve interleaving of processed frames #236

Open
IanDBird opened this issue Jan 5, 2023 · 0 comments
IanDBird commented Jan 5, 2023

Problem
From my reading of the code, and from logging the PTS values of samples fed into the MediaTarget, it appears that samples are fetched from each track sequentially (round robin). This results in a stream similar to "A/V/A/V/A/V... etc". However, this doesn't take into account that the duration of each frame (aka sample) is likely to be very different per track. For a video with a frame rate of 30fps, a single video frame will likely be presented while multiple audio frames are rendered via the AudioTrack.
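A back-of-envelope check illustrates the mismatch (the constants below are common codec defaults, not values taken from LiTr): at 30fps a video frame spans ~33.3 ms, while an AAC frame is 1024 PCM samples, ~21.3 ms at 48 kHz, so properly interleaved output needs roughly 1.5 audio samples per video sample rather than strict A/V alternation.

```java
// Frame-duration arithmetic for a typical 30fps video / 48kHz AAC audio pair.
public class FrameDurations {
    public static void main(String[] args) {
        double videoFrameMs = 1000.0 / 30.0;             // ~33.33 ms per video frame
        double audioFrameMs = 1024.0 * 1000.0 / 48000.0; // ~21.33 ms per AAC frame (1024 samples)
        System.out.printf("video=%.2fms audio=%.2fms audio frames per video frame=%.4f%n",
                videoFrameMs, audioFrameMs, videoFrameMs / audioFrameMs);
        // ratio comes out to 1.5625, i.e. ~1.5 audio samples per video sample
    }
}
```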

This behaviour results in audio and video samples that should be presented at the same time being spread sparsely throughout the output file. The default Android muxer (or other open source muxers) can fix up the interleaving. For small files this is likely not very noticeable, but it would still require players to seek back and forth within the file, and the issue becomes more noticeable as the output file grows (it's worth noting that ExoPlayer has different per-track buffering logic compared to other video players). If we wanted to support a segmented output file, this behaviour is also problematic: we need each segment to contain the same duration of every contained stream.

Proposed Solution
One possible solution could be:

  • Modify TrackTranscoder to report the PTS of the last frame processed
  • Modify TransformationJob.processNextFrame to be smarter about which TrackTranscoder to process next. We could define a maximum duration that any stream is allowed to be written ahead of the others. If we observe that a stream (e.g. video) is beyond that limit, we continue to process the audio track until it has caught up.
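The selection logic above could be sketched roughly as follows. This is only an illustration of the idea, not LiTr's actual API: the class, method names, and the 500 ms drift cap are all assumptions.

```java
// Hypothetical sketch of PTS-aware track selection: always advance the track
// that is furthest behind, and stall any track that has run too far ahead.
final class InterleavingScheduler {
    // Maximum duration any track may be written ahead of the slowest one
    // (assumed value for illustration).
    private static final long MAX_DRIFT_US = 500_000L; // 500 ms

    /** Picks the track whose last written PTS (in microseconds) is furthest behind. */
    static int pickNextTrack(long[] lastWrittenPtsUs) {
        int behind = 0;
        for (int i = 1; i < lastWrittenPtsUs.length; i++) {
            if (lastWrittenPtsUs[i] < lastWrittenPtsUs[behind]) {
                behind = i;
            }
        }
        return behind;
    }

    /** True if writing more samples for this track would exceed the drift cap. */
    static boolean shouldStall(long[] lastWrittenPtsUs, int track) {
        long min = Long.MAX_VALUE;
        for (long pts : lastWrittenPtsUs) {
            min = Math.min(min, pts);
        }
        return lastWrittenPtsUs[track] - min > MAX_DRIFT_US;
    }
}
```

With video already written up to 1.0 s and audio only up to 0.1 s, `pickNextTrack` would return the audio track, and `shouldStall` would report the video track as over its write-ahead budget.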

I took a look at Google's Media3 / Transformer, and it looks like they're doing something very similar: https://github.com/androidx/media/blob/main/libraries/transformer/src/main/java/androidx/media3/transformer/MuxerWrapper.java#L57
