Add Support for Closed Captions to .MKV container #375

sbshepherd · 2020-04-29T14:15:35Z

I would like to transcode existing .MXF video files to FFV1/.MKV for long-term preservation but the closed caption streams get stripped out because the .MKV container can’t contain them.

The codecs of my .MXF files differ, but the example I’ll use here is DNxHD. It contains six closed caption streams (different languages). Three are EIA-608 and three are EIA-708. These show in MediaInfo as “Text” streams, and they show in FFprobe as a single data stream with a data_type of “vbi_vanc_smpte_436M.” See screenshots below:

If it’s helpful, here is the video codec information for this file:

Ideally, these caption streams will carry over into the new .MKV file and be playable in a standard media player such as VLC. I should be able to turn on/off each language as the video plays.

JeromeMartinez · 2020-04-29T14:48:19Z

and they show in FFprobe as a single data stream with a data_type of “vbi_vanc_smpte_436M.”

FYI "vbi_vanc_smpte_436M" in FFmpeg is called "Ancillary Data" in MediaInfo (608/708 captions are muxed in the "vbi_vanc_smpte_436M" which is muxed in the MXF).

It contains six closed caption streams

For reference: 1 closed caption stream, format is CDP.
We could extract 608 from CDP, but 708 cannot stand alone (it needs CDP).
Here, I think we should convert CDP to "DTVCC Transport" (the transport layer of the 708 spec, with the same features as CDP). This stream would transport 608 and 708 streams as in ATSC streams ("DTVCC Transport" is the content of the AVC or HEVC private element dedicated to captions).
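To make the "608 can be extracted, 708 needs its transport" point concrete, here is a minimal sketch of how a run of caption triplets could be split. It assumes the usual 3-byte layout (five marker bits, a cc_valid bit, a 2-bit cc_type, then two data bytes), with cc_type 0/1 meaning 608 field 1/2 and 2/3 meaning DTVCC packet data/start; this layout is from memory and should be checked against the CEA-708 text. The function name `split_cc_triplets` is hypothetical, and parsing of the CDP header and footer is deliberately left out.

```python
from typing import List, Tuple

# cc_type values, assumed from the usual description of cc_data triplets.
NTSC_FIELD_1 = 0        # CEA-608 byte pair, field 1
NTSC_FIELD_2 = 1        # CEA-608 byte pair, field 2
DTVCC_PACKET_DATA = 2   # continuation of a 708 caption channel packet
DTVCC_PACKET_START = 3  # first two bytes of a 708 caption channel packet


def split_cc_triplets(cc_data: bytes) -> Tuple[List[Tuple[int, int]], List[Tuple[int, int]], bytes]:
    """Split 3-byte caption triplets into 608 field 1 pairs, 608 field 2 pairs,
    and the concatenated DTVCC (708) bytes. Invalid triplets are skipped."""
    if len(cc_data) % 3 != 0:
        raise ValueError("cc_data length must be a multiple of 3")
    field1: List[Tuple[int, int]] = []
    field2: List[Tuple[int, int]] = []
    dtvcc = bytearray()
    for i in range(0, len(cc_data), 3):
        header, b1, b2 = cc_data[i], cc_data[i + 1], cc_data[i + 2]
        cc_valid = bool(header & 0x04)  # bit 2 of the first byte
        cc_type = header & 0x03         # low two bits of the first byte
        if not cc_valid:
            continue
        if cc_type == NTSC_FIELD_1:
            field1.append((b1, b2))
        elif cc_type == NTSC_FIELD_2:
            field2.append((b1, b2))
        else:
            # Both DTVCC types carry 708 bytes; packet boundaries are not
            # tracked in this sketch.
            dtvcc.extend((b1, b2))
    return field1, field2, bytes(dtvcc)
```

The 608 pairs come out self-contained, while the 708 bytes only make sense together with the start/continuation signalling, which is why a transport layer has to be kept for them.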

Several steps here:

defining 708 transport layer in a MKV track (extension document)
implementing it in an encoder
implementing it in VLC

sbshepherd · 2020-05-13T13:55:56Z

There is a sample file at the following location if one is needed for testing: https://archive.org/details/xdcam_sample_with_caption_track

robUx4 · 2020-05-24T09:05:36Z

From the point of view of Matroska, do we need another track type for Closed Captions (as the title suggests)? Or can they be put in subtitle tracks with the proper codec mapping?

mbunkus · 2020-05-24T10:04:03Z

Personally I'd vote for "keep it subtitle, let CodecID speak for itself". We already integrate many different formats under the type "subtitle"; some are text-based, others are images. And closed captions fulfill largely the same role.

I don't consider their traditional way of transportation (embedded in the video track) to be relevant for our decision.

JeromeMartinez · 2020-05-26T08:23:35Z

I don't consider their traditional way of transportation (embedded in the video track) to be relevant for our decision.

The debate open/closed caption vs subtitle is not based on their traditional way of transportation, it is about the nature of the content, see for example the description of both in HTML or a long explanation about the "difference".

But IMO we could keep the "S_" prefix, as in practice there is so little difference.

mbunkus · 2020-05-26T08:54:44Z

Well, what our track type "subtitles" transports can easily fill both roles, and it only depends on the content. Similar to how "audio" tracks can contain the whole dialog or only the director's comments. I don't see any reason to use a separate track type.

JeromeMartinez · 2020-05-26T08:55:56Z

I don't see any reason to use a separate track type.

No worry :), we are in sync here! (keeping "S_" prefix)

mbunkus · 2020-05-26T08:56:16Z

It would definitely be good to have a type indicator orthogonal to the track type. We've talked about such a track header field several times already.

dericed · 2020-05-27T00:02:07Z

Is it worthwhile to support these as block additional mappings? To store the raw bytes along their corresponding frame.

mbunkus · 2020-05-27T06:52:00Z

What would the advantages be?

I'm pretty much against that. CCs are something you can turn on or off, they're something that has meta data such as e.g. a track name and a language, they're basically something that acts like a track. So let's make it a (separate) track, not be somehow part of another track.

MikeChenMM · 2020-06-20T10:48:38Z

Hi, all, another instance of "I did it unofficially again" here.
MakeMKV transcodes closed captions from DVD to UTF-8 SRT. To do so, internally, it first extracts 608 data into a separate track "S_CC608/DVD" and then internally converts this track into "S_TEXT/UTF8". By default, "S_CC608/DVD" track is never written to actual MKV file. In theory one can change conversion profile to get raw S_CC608/DVD stream instead of converted copy.
I personally see very little benefit in supporting raw 608 streams - they are overwhelmingly generated from text subtitles, can be automatically converted to text subtitles and are essentially text subtitles. The code to convert 608/708 to text is GPL. Why bother?...

sbshepherd · 2020-06-22T13:54:10Z

This might be a silly question, but I wonder how the various options will affect potential broadcast of the content. If the broadcasting arm of our organization wants to use the .MKV with closed captions, will that be doable under any of these scenarios? I understand the differences between closed captions and subtitles are minimal in practice, but it occurs to me that the use cases for these files may not be solely web-based. They should be preservation-worthy, meaning I shouldn't lose functionality that was already in the original file. I can hear someone arguing "then why not keep it .MXF?" Answer: because .MXF is the problem we're trying to solve.

I don't know a lot about broadcasting, so maybe the idea of broadcasting an .MKV won't work regardless. If that's the case, then the file (with captions) would need to be capable of transcoding to a format that can maintain the captions for broadcast.

Am I asking too much? :)

robUx4 · 2022-05-22T06:53:42Z

We have the new track flags since #447, namely FlagHearingImpaired, FlagVisualImpaired and FlagTextDescriptions.

Also the subtitle track is defined as

Subtitle or closed caption data to be rendered over the video track(s).

So apart from the actual 608 and 708 codec definitions, do we need anything else?

dhouck · 2023-03-31T10:11:16Z

I personally see very little benefit in supporting raw 608 streams - they are overwhelmingly generated from text subtitles, can be automatically converted to text subtitles and are essentially text subtitles. The code to convert 608/708 to text is GPL. Why bother?...

Which code are you talking about? Iʼve seen multiple programs that convert closed captions to other subtitle formats, but they usually lose relevant information in at least some cases. Since people want to use Matroska for archival, there should be the option of storing the original data instead of lossily converting it.

One potential difficulty I see is that one closed captioning stream can have multiple logical tracks (for example, one language in CC1 and another in CC2), and Iʼm not sure if thereʼs any good way to handle that.

JeromeMartinez · 2023-03-31T10:23:25Z

but they usually lose relevant information in at least some cases.

To add some examples to this conversation about the lack of losslessness: metadata is lost (program name, content advisory, network name, weather info, etc.), exact timing of events may be lost too, especially in RollUp mode, positioning and colors may also be lost (most converters don't care about that and put everything in a .srt without positioning), and reversibility to CEA-608/708 is a lot more difficult (and we would have to code it; AFAIK there is no such code yet).

dericed · 2023-03-31T13:35:53Z

I reread this discussion and IIUC there seems to be rough consensus in:

storing 608/708 caption data as a subtitle track rather than side data on the video frames
storing 608/708 as-is rather than transforming to a more common subtitle format (such conversions are often lossy)

I agree with @sbshepherd's concern that a 608/mxf to 608/mkv conversion may have some loss in functionality, particularly in broadcast settings, but this is a chicken/egg issue. No one can add support for 608/mkv broadcast functionality until we have the specification written and ideally some sample files freely published.

In planning to store 608/708 data, I suggest also considering muxing in scc files as an input.

I'm curious to know more about @MikeChenMM's internally-defined S_CC608/DVD.

Perhaps we could document a few scenarios:
S_CC608 where the block stores the two octets of caption data. This is similar to how 608 captions are written into the VAUX header of DV or the c608 track of QuickTime.
S_CCS436M which stores the SMPTE 436M values.

Though in each of these scenarios would we need a CodecPrivate definition?
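As a rough illustration of the S_CC608 scenario, where a block holds just the two octets of caption data, a reader could validate and unpack a block like this. It relies on CEA-608 bytes carrying odd parity in the top bit; the function names and the idea of rejecting a block on a parity failure are assumptions made for this sketch, not something the thread has decided.

```python
from typing import Tuple


def has_odd_parity(byte: int) -> bool:
    """Return True if the 8-bit value has an odd number of set bits."""
    return bin(byte & 0xFF).count("1") % 2 == 1


def decode_cc608_block(block: bytes) -> Tuple[int, int]:
    """Check parity on a two-octet 608 block and return the two 7-bit values."""
    if len(block) != 2:
        raise ValueError("a 608 block in this sketch is exactly two octets")
    for octet in block:
        if not has_odd_parity(octet):
            raise ValueError("parity error in octet 0x%02X" % octet)
    # Drop the parity bit (bit 7) to get the 7-bit code values.
    return block[0] & 0x7F, block[1] & 0x7F
```

For example, `decode_cc608_block(bytes([0x94, 0x20]))` yields the 7-bit pair `(0x14, 0x20)`. A preservation-oriented muxer would presumably store the octets untouched, parity bit included, and leave this kind of checking to the reader.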

JeromeMartinez · 2023-03-31T13:51:18Z

Perhaps we could document a few scenarios

Also S_CC708 for extracting c708 from Ancillary data (or MOV).

Though in each of these scenarios would we need a CodecPrivate definition?

They are both "streaming" formats so don't require any configuration data.

MikeChenMM · 2023-03-31T16:43:07Z

On 31/03/2023 14:36, Dave Rice wrote:

I reread this discussion and IIUC there seems to be rough consensus in:

storing 608/708 caption data as a subtitle track rather than side data on the video frames
storing 608/708 as-is rather than transforming to a more common subtitle format (such conversions are often lossy)

I agree with @sbshepherd's concern that a 608/mxf to 608/mkv conversion may have some loss in functionality, particularly in broadcast settings, but this is a chicken/egg issue. No one can add support for 608/mkv broadcast functionality until we have the specification written and ideally some sample files freely published.

In planning to store 608/708 data, I suggest also considering muxing in scc files as an input.

I'm curious to know more about @MikeChenMM's internally-defined S_CC608/DVD.

MakeMKV has a notion of conversion profiles, which are XML files defining rules to create a conversion graph. Internally, CCs are extracted into their own track (S_CC608/DVD), but the default conversion profile always attaches a CC->SRT filter on top, so by default CCs are saved as SRT. However, one can make a custom conversion profile and force MakeMKV to save raw CC as a separate track. The data format is trivial: just CC bytes with timecodes from the related frame. If you want to play with that, attached is a conversion profile that saves CC both as raw CC and as an SRT file. This file has to be dropped into the "MakeMKV data directory", which can be looked up in preferences.

…

-- Thanks, Mike.

dhouck · 2023-04-04T08:53:39Z

Iʼd like to expand on the worry I mentioned before. Consider the attached SCC file (zipped so it can be uploaded to GitHub), which I made as an example for this YouTube video¹, which is a song with Italian lyrics but an official English translation. The one SCC file has data for both, although most software Iʼve used can only access the Italian track. (This technique of multiple languages in different channels is common in certain situations, although I created the specific file myself as an example for this bug report).

Ideally, this one stream of byte pairs would decompose into two subtitle tracks (one Italian, the other English); I donʼt know if Matroska currently supports tracks sharing data like that but if not some other solution would need to be found, and I imagine requiring the muxer to figure out which track each byte pair relates to is not the best answer.

I donʼt know much about 708, but I think it exacerbates this issue by being able to carry more data in whatʼs still logically the same stream.

SognoDiVolareStayAtHomeChoir.zip

¹ Note that the video, and the SCC file that goes with it, are 25 FPS; some tools assume 29.97 FPS for SCC files, but given the relationship between the caption data and the video frames it makes sense to match the frame rate.
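To show why the assumed frame rate matters when an SCC file is used as muxing input, here is a minimal sketch that turns one SCC data line into timestamped byte pairs. It assumes the common SCC layout (a non-drop `HH:MM:SS:FF` timecode, whitespace, then 4-hex-digit words, one word per frame); the function name `parse_scc_line` and its parameters are hypothetical, and drop-frame timecodes (written with `;`) are deliberately not handled.

```python
from fractions import Fraction
from typing import List, Tuple


def parse_scc_line(
    line: str,
    nominal_fps: int = 25,
    frame_rate: Fraction = Fraction(25),
) -> List[Tuple[Fraction, Tuple[int, int]]]:
    """Return (time_in_seconds, (byte1, byte2)) for each word on an SCC line.

    nominal_fps is the frame count per timecode second; frame_rate is the
    real video frame rate used to convert frame numbers to seconds.
    """
    timecode, payload = line.strip().split(None, 1)
    if ";" in timecode:
        raise NotImplementedError("drop-frame timecode is not handled in this sketch")
    hours, minutes, seconds, frames = (int(part) for part in timecode.split(":"))
    first_frame = ((hours * 60 + minutes) * 60 + seconds) * nominal_fps + frames
    result = []
    for offset, word in enumerate(payload.split()):
        value = int(word, 16)
        pair = (value >> 8, value & 0xFF)
        # Each word is assumed to occupy one video frame after the timecode.
        result.append((Fraction(first_frame + offset) / frame_rate, pair))
    return result
```

With the defaults, a line starting at `00:00:01:00` begins exactly at 1 second. Calling it with `nominal_fps=30, frame_rate=Fraction(30000, 1001)` puts the same line at 1.001 seconds and spaces the pairs differently, which is the mismatch the footnote warns about.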

robUx4 · 2023-10-08T12:10:14Z

I donʼt know if Matroska currently supports tracks sharing data like that

No. And IMO it's not a feature we would want. If you remux the file and only want to keep the English version, you would need to know which part needs to go with what (or it could be hidden from the user). It would also be tricky to implement for players. They may have closed caption support and split the tracks accordingly. But when stored in Matroska each language should have its own track.

That means that to mux these 2 tracks properly you need to be able to parse the 708 data to generate two tracks that each keep one language. That probably means having a 708 "encoder" as well, if you have to use some 708 features. This is tricky, but not trickier than having to handle tracks whose content depends on another track in all players.
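For the simpler 608 case (the SCC example with two languages), the splitting step could be sketched as follows. It assumes the usual 608 convention that, after stripping parity, a control code has its first byte in 0x10 to 0x1F, that bit 3 of that byte selects the second data channel, and that text pairs belong to whichever channel the most recent control code selected. The function name is hypothetical, timestamps are ignored, and bytes below 0x10 are simply skipped, so this is an outline of the idea and not a complete demultiplexer.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def split_field1_channels(pairs: Iterable[Tuple[int, int]]) -> Dict[int, List[Tuple[int, int]]]:
    """Split field 1 byte pairs into data channel 1 (CC1) and channel 2 (CC2).

    Returned pairs have the parity bit stripped.
    """
    channels: Dict[int, List[Tuple[int, int]]] = {1: [], 2: []}
    current: Optional[int] = None
    for raw1, raw2 in pairs:
        b1, b2 = raw1 & 0x7F, raw2 & 0x7F  # drop the parity bit
        if b1 < 0x10:
            # Null padding and codes this sketch does not handle.
            continue
        if b1 <= 0x1F:
            # Control code: bit 3 of the first byte selects the channel.
            current = 2 if b1 & 0x08 else 1
        if current is not None:
            channels[current].append((b1, b2))
    return channels
```

Each resulting list would still need its original timestamps to become a Matroska track, and a real muxer would have to re-add parity and padding if the goal is to reproduce valid 608 data per track.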

I think this use case is specific to 708 (and 608?), a modern format would not mix languages like that in a binary format.

robUx4 added the codec mapping label May 24, 2020

robUx4 added the spec_codecs Codec Matroska spec document target label Mar 14, 2021

robUx4 mentioned this issue Apr 13, 2024

add c608 mapping #823

Open
