add a note about MIME types of font attachments #518
Comments
There are the following 10 types: (1) These MIME types are officially registered, and MUST be supported:
(2) This is a valid generic type, and MAY be supported if the player checks the file extension.
(3) These two have been the de facto standard types, and SHOULD (or MUST) be supported:
(4) These are rare types which might be sometimes used in the wild; MAY be supported:
*1 “Note that "font/sfnt" is an abstract type from which the (widely used in practice) "font/ttf" and "font/otf" types are conceptually derived. Use of "font/sfnt" is likely to be rare in practice, and might be confined to: Uncommon combinations such as "font/sfnt; layout=sil" that do not have a shorter type” [RFC8081]
*2 DEPRECATED in favor of font/sfnt [IANA], but still valid. “Contrary to the expectations of the W3C WebFonts WG, which developed Web Open Font Format (WOFF), the officially defined media types such as "application/font-woff" and "application/font-sfnt" see a very limited use” [RFC8081].
*3 An x- type, application/x-truetype-font, is technically valid [RFC2045]: “A media type value beginning with the characters "X-" is a private value, to be used by consenting systems by mutual agreement.” [RFC2046]. Haali started using it in his experimental patch to Gabest's code; an x- type exists exactly for such a situation. One should not mistake application/x-truetype-font for something wrong or non-standard. It was, and still is, a perfectly valid MIME type, although of course its usage is now discouraged, as the official type for TTF has been registered. |
Can you provide some advice for encoders? Are there any common players that do not support font/ttf and font/otf? Point (2) concerns me. Implementations can check the extension, but they SHOULD NOT derive the type by examining the file contents. This is what Windows Outlook still does (27 years later), and the result is endless trojans in email that evade scanning, because the generic open call launches some Word macro in a file that presented itself as an image. |
I'd say encoders should keep using the legacy types for the time being. Using font/ttf, you may feel happy knowing that you're rigorously following the standard, and in the long run we should do things the standardized way. But currently there are still some Windows users whose players do not fully support font/ttf (anything before October 2019: MPC-BE 1.5.4 and earlier; MPC-HC 1.8.8 and earlier; LAV Filters 0.74.1 and earlier). Linux/Mac users should be okay, though. MPC-HC is widely used, but if a user downloads its "latest" version from the official site or from sourceforge.net, they'll get an older version. Because of this, styled subtitles may break for a significant share of Windows users (maybe 10–20% of them?) when font/ttf is used.

Unfortunately, the current version of MKVToolnix (v58.0.0) writes font/sfnt by default for TTF. MKVToolnix also wrote application/font-sfnt, at least on Mac, in the past. So it's too late. It seems that libmagic is responsible for these unusual (although technically valid) MIME types.

Re: Point (2). By checking the file extension, some versions of LAV Filters & MPC-HC were able to load font/ttf files correctly, even when that MIME type was not explicitly supported by them. That was actually helpful for end users (except that the implementation was slightly ad hoc: .ttf was loaded but .TTF was not).

But I do understand your concern. First off, loading a file just because it says font/ttf is already dangerous. One can create a malicious MKV file in which an abnormally huge font file is attached (or so it claims). A naive player might try to read beyond EOF, or at least something unpleasant may occur. So a note about security considerations may be a good idea. On the other hand, there are legitimate CJK fonts larger than 20 MB (e.g. Microsoft YaHei). |
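To make the "reads beyond EOF" concern concrete, here is a minimal sketch in Python of how a player might validate an attachment's declared size before reading it. The function name `read_attachment` and the 64 MiB cap are hypothetical choices for illustration, not taken from any player or from the Matroska spec; the cap is simply set well above the 20 MB CJK fonts mentioned above.

```python
import io

# Hypothetical upper bound; legitimate CJK fonts can exceed 20 MB,
# so the cap is deliberately generous.
MAX_FONT_BYTES = 64 * 1024 * 1024


def read_attachment(stream, offset, declared_size, max_bytes=MAX_FONT_BYTES):
    """Return the attachment bytes, or None if the declared size is implausible."""
    if offset < 0 or declared_size <= 0 or declared_size > max_bytes:
        return None
    # Measure the real file length instead of trusting the container.
    file_size = stream.seek(0, io.SEEK_END)
    if offset + declared_size > file_size:
        return None  # the attachment claims to extend past EOF
    stream.seek(offset)
    data = stream.read(declared_size)
    # A short read is treated as a failure as well.
    return data if len(data) == declared_size else None
```

A caller would skip the font (and fall back to a system font) whenever this returns `None`, rather than aborting playback.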
About point (2): the concern is not file extensions; those are just metadata, equivalent to the MIME type. And font/sfnt is among the MUST types, so I guess it has to remain acceptable to write it. |
Saying that a muxer MUST NOT use an x- type anymore would be too harsh. It MAY use it for backward compatibility and/or interoperability, given that the standard explicitly guarantees that an x- type is freely usable as long as there is a mutual agreement (between the writing application and the player, in this case). Afaik all existing players recognize the legacy types, so I'd say there is a mutual agreement. Legacy types should be phased out eventually, but the change should be gradual so that no one will be upset.

On Windows, when a player "loads" a font, it doesn't start any application; it just calls a function like AddFontResourceEx, which simply fails if the data is not a valid font. The attached font can be installed privately, not visible to other processes. This is quite different from an email attachment, where a random application may start automatically if you click an icon.

Let's say, hypothetically, that the player can handle TTF but cannot handle TTC, and that the MIME type is ambiguous or unreliable. So the player wants to know whether it's TTF or TTC. One quick way is to check the extension and see whether it's .ttc. Another way is to read the first 4 bytes of the attached data and see whether they are 'ttcf'. These two are not so different. It's not as if reading the first 4 bytes is intrinsically more dangerous.

That said, you're right, an attached file should be treated carefully: an attacker may be able to create a malicious MKV to exploit a font-related security hole in a specific (poorly designed) player, though such an attack vector does not seem very likely. |
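The two TTF/TTC checks described above can be sketched as follows in Python. This is only an illustration: the function names are hypothetical, and the tag values beyond 'ttcf' (0x00010000 and 'true' for TrueType outlines, 'OTTO' for CFF outlines) come from the OpenType/TrueType file header definitions rather than from this thread.

```python
def sniff_font_container(data: bytes) -> str:
    """Classify font data by its first 4 bytes; no further parsing is done."""
    tag = data[:4]
    if len(tag) < 4:
        return "unknown"
    if tag == b"ttcf":
        return "collection"        # TrueType/OpenType collection (TTC)
    if tag in (b"\x00\x01\x00\x00", b"true"):
        return "truetype"          # single font with TrueType outlines
    if tag == b"OTTO":
        return "opentype-cff"      # single font with CFF outlines
    return "unknown"


def has_collection_extension(filename: str) -> bool:
    """Extension-based check, case-insensitive so that .TTC also matches."""
    return filename.lower().endswith(".ttc")
```

Both checks look only at a few bytes of attacker-controlled data (the name or the header), which is the point made above: neither is intrinsically riskier than the other, and the font loader still has to validate the rest of the file.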
I don't think the normative notion of MUST/SHOULD/MUST NOT/etc. applies to fonts. It would be like saying every Matroska implementation MUST support the h264 and mp3 codecs. It's up to each player to decide what it wants to support. The fonts described here are even "extensions" of a subtitle codec. So if the player doesn't support these subtitle codecs, there's no reason to force it to support any of these MIME types. IMO it's up to each subtitle codec to define which font formats it wants the player to support. In other words, it should go in the codec document, not the "main" Matroska spec. |
You're right. As the first post says, “a player that supports embedded fonts for subtitles” should be careful about the backward compatibility, is all. It's NOT like every player should support embedded fonts. If players support fonts, then they are strongly recommended to support legacy MIME-types too. It's reasonable, isn't it? The docs coming with the latest MKVToolnix still use legacy MIME-types too. |
and what a writer should do According to the findings from #518
I updated #115 to include the MIME types a player can expect and what a writer should use (new MIME, unless playback with old players is important). In the end we can't just rely on the codec spec to tell how font attachments have to be used. There are too many fine details to deal with and they are unrelated to the codec itself. The use of font attachments remains entirely optional, but it means the subtitle rendering might be incorrect. (I'm not sure VLC supports them, although it does read them) |
Exactly. It's important for a soft-subber to realize that font support is optional and one can't reliably control softsub rendering. It's like CSS + browser. Also, it's true that subtitles may not be even readable when the attached font is not loaded (e.g. when it's a minority language whose alphabet is not supported by OS). The only surefire way to avoid this is hardsubbing. VLC supports embedded fonts almost perfectly. I didn't check the source code, but if I'm guessing right, although it doesn't explicitly support |
I found the relevant code in VLC. The MIME type is only used to check whether the attachment is a font. It's up to freetype's |
Here in libass (in VLC)
|
Yeah, it should be more consistent with the freetype one. Also, the "extension to MIME type" conversion should be done in the Matroska demuxer. The subtitle/text renderer should only have to deal with MIME types. |
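As a rough illustration of doing the "extension to MIME type" conversion on the demuxer side, here is a hedged Python sketch. Everything in it is an assumption for illustration: the function name `normalize_font_mime` is hypothetical, the set of known types contains only types named in this thread (it is not the full list), and `application/octet-stream` is assumed to be the generic type for which the extension fallback applies.

```python
import os

# Only types named in this thread; NOT a complete list.
KNOWN_FONT_TYPES = frozenset({
    "font/ttf",
    "font/otf",
    "font/sfnt",
    "application/font-sfnt",
    "application/x-truetype-font",
})

# Extension fallback table (illustrative).
EXTENSION_TO_MIME = {".ttf": "font/ttf", ".otf": "font/otf"}


def normalize_font_mime(mime, filename,
                        generic_types=("application/octet-stream",)):
    """Return a font MIME type for the attachment, or None if it is not a font.

    generic_types is an assumption: the types for which the file
    extension is consulted instead of the declared MIME type.
    """
    # Drop parameters such as "; layout=sil" and compare case-insensitively.
    base = mime.split(";", 1)[0].strip().lower()
    if base in KNOWN_FONT_TYPES:
        return base
    if base in generic_types:
        # Lower-casing the extension avoids the ".ttf but not .TTF" problem.
        ext = os.path.splitext(filename)[1].lower()
        return EXTENSION_TO_MIME.get(ext)
    return None
```

With this split, the renderer only ever sees a normalized MIME type (or nothing), and the extension is never consulted when the declared type is a specific non-font type.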
Please check my comment on dc90789 |
Not sure why I tagged this as a codec thing... Anyway |
Because now it's up to each codec that uses fonts to mention it in its codec definition. |
As explained over on doom9's forum, attached fonts in Matroska have long used a legacy MIME type, because official MIME types for fonts were not available for a very long time. Liisachan requested that we add a note about which MIME types to use for new files & which MIME types players should support in order to be able to play older files as well.
Here's their suggestion for a starting point for that note:
This cannot be used as is, of course. I'll whip up a PR soon(ish).