CCExtractor and detecting Closed Caption language #11

Open
rlaphoenix opened this issue Feb 6, 2023 · 0 comments
Labels
bug Something isn't working help wanted Extra attention is needed

Comments

@rlaphoenix
Member

Describe the bug
Imagine a title with four WebVTT subtitles: German, French, Italian, and Spanish. The title itself is in English, but its only English subtitle is carried in a C608 box.

The current code runs CCExtractor only when NO other subtitles are available, so in this case the English subtitles would be missed. However, we also cannot assume the C608 track is English. If it were German and we extracted it because no English subtitle exists, we would still be missing English subtitles and would now also have a duplicate German subtitle.

C608 boxes extracted via CCExtractor currently carry no language information (unless I'm not looking hard enough). We need to detect the language to handle this case properly.

Expected behavior
CCExtractor should run whenever there are no subtitles in the title's original language. For example, if an English-language title such as The Sopranos has no English subtitles, it should run CCExtractor to check for potential English C608 boxes. It should also check the language of the extracted C608 captions: if it is English, use it; otherwise, use it only if there is no other subtitle in that language.
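
A rough sketch of the decision described above, not the project's actual code. The function names (`primary`, `should_run_ccextractor`, `should_keep_cc`) are hypothetical, languages are assumed to be plain tag strings such as `en` or `en-US`, and treating an undetected caption language as "do not keep" is an assumption rather than something decided in this issue.

```python
from typing import Iterable, Optional


def primary(tag: str) -> str:
    """Return the lowercase primary language subtag, e.g. 'en-US' -> 'en'."""
    return tag.replace("_", "-").split("-")[0].lower()


def should_run_ccextractor(original_language: str,
                           subtitle_languages: Iterable[str]) -> bool:
    """Run CCExtractor only if no existing subtitle matches the title's language."""
    wanted = primary(original_language)
    return all(primary(lang) != wanted for lang in subtitle_languages)


def should_keep_cc(cc_language: Optional[str],
                   subtitle_languages: Iterable[str]) -> bool:
    """Keep the extracted captions only if they add a language we do not have yet."""
    if cc_language is None:
        # Assumption: without a detected language we cannot rule out a duplicate.
        return False
    existing = {primary(lang) for lang in subtitle_languages}
    return primary(cc_language) not in existing


# The scenario from the bug description: English title, four non-English subs.
subs = ["de", "fr", "it", "es"]
print(should_run_ccextractor("en", subs))  # True: no English subtitle exists
print(should_keep_cc("en", subs))          # True: English captions fill the gap
print(should_keep_cc("de", subs))          # False: would duplicate German
```

Because extraction only happens when the original language is missing, the "is it English, otherwise is it a new language" rule collapses into a single check against the languages already present.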

Another option would be to detect a subtitle's language by analyzing its text content. If we can do that, we could simply check whether we still need a subtitle in the detected language and, if so, take it.
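
To illustrate the text-analysis idea, here is a toy, standard-library-only sketch that guesses a language by counting common stopwords in SRT/WebVTT-style caption text. Everything in it is an assumption for illustration: the `STOPWORDS` lists are tiny and hand-picked, `caption_text` and `guess_language` are hypothetical names, and a real implementation would more likely rely on a dedicated language-identification library.

```python
import re
from typing import Optional

# Tiny illustrative stopword lists (assumption: far too small for real use).
STOPWORDS = {
    "en": {"the", "and", "is", "you", "that", "it", "of", "to", "what", "this"},
    "de": {"der", "die", "das", "und", "ist", "nicht", "ich", "du", "es", "ein"},
    "fr": {"le", "la", "les", "et", "est", "je", "vous", "que", "pas", "un"},
    "it": {"il", "che", "non", "è", "per", "sono", "una", "di", "lo", "ma"},
    "es": {"el", "los", "y", "es", "que", "no", "una", "por", "lo", "pero"},
}


def caption_text(subtitle: str) -> str:
    """Drop cue numbers, timing lines, and the WEBVTT header; keep the dialogue."""
    kept = []
    for line in subtitle.splitlines():
        line = line.strip()
        if not line or line.isdigit() or "-->" in line or line == "WEBVTT":
            continue
        kept.append(line)
    return " ".join(kept)


def guess_language(subtitle: str, min_hits: int = 3) -> Optional[str]:
    """Return the best-scoring language code, or None if the evidence is weak or tied."""
    words = re.findall(r"[^\W\d_]+", caption_text(subtitle).lower())
    scores = {lang: sum(w in stop for w in words) for lang, stop in STOPWORDS.items()}
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    (best, top), (_, second) = ranked[0], ranked[1]
    if top < min_hits or top == second:
        return None
    return best


sample = "1\n00:00:01,000 --> 00:00:03,000\nWHAT IS THIS? YOU KNOW THAT IT IS THE END.\n"
print(guess_language(sample))  # en
```

Returning `None` on weak or tied evidence keeps the caller from guessing, which matters here because a wrong guess reintroduces the duplicate-subtitle problem described above.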

@rlaphoenix rlaphoenix added the bug Something isn't working label Feb 6, 2023
@rlaphoenix rlaphoenix changed the title from "CCExtractor should run if there is no subtitles in the title's Original Language" to "CCExtractor and detecting Closed Caption language" Feb 6, 2023
@rlaphoenix rlaphoenix added the help wanted Extra attention is needed label Mar 8, 2024
@rlaphoenix rlaphoenix pinned this issue Mar 8, 2024