
An SRT parser that can also handle malformed SRT timestamps (for example 00:00:12.682, which uses a dot as the separator where a comma is required)

clowdr-app/srt-parser-2

srt-parser-2

This is an SRT parser for JavaScript.
It reads the contents of an .srt file into an array.

Install

npm

npm install srt-parser-2

or yarn

yarn add srt-parser-2

Example

Given an SRT file like this:

1
00:00:11,544 --> 00:00:12,682
Hello

the parser returns:

[{
    id: '1',
    startTime: '00:00:11,544',
    endTime: '00:00:12,682',
    text: 'Hello' 
}]
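
To make the mapping from a cue block to an object concrete, here is a minimal sketch that handles a single well-formed block. `parseCue` is a hypothetical helper written for illustration; it is not the implementation used by srt-parser-2 and does no timestamp correction.

```javascript
// Hypothetical helper, not part of srt-parser-2: parse one well-formed cue block.
function parseCue(block) {
  // Drop surrounding blank lines, then split into individual lines.
  const lines = block.trim().split(/\r?\n/);
  // The first line is the cue id.
  const id = lines[0].trim();
  // The second line holds "start --> end".
  const [startTime, endTime] = lines[1].split("-->").map((s) => s.trim());
  // Everything after the timing line is the subtitle text.
  const text = lines.slice(2).join("\n");
  return { id, startTime, endTime, text };
}

const cue = parseCue("1\n00:00:11,544 --> 00:00:12,682\nHello\n");
console.log(cue);
```

The sketch assumes the block has at least an id line and a timing line; a real parser also has to split the file into blocks and cope with malformed input.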

Environment support

Since it only processes text,
it should work in both browser and Node.js environments.

Usage

var { default: srtParser2 } = require("srt-parser-2")

var parser = new srtParser2()
var srt = `
1
00:00:11,544 --> 00:00:12,682
Hello
`
var result = parser.fromSrt(srt);
console.log(result);

CLI

npx srt-parser-2 -i input.srt -o output.json --minify

Options:

| Option | Required | Default |
| --- | --- | --- |
| --input or -i | Yes | |
| --output or -o | No | output.json |
| --minify | No | false |

License

MIT

Why?

Why is this one special? There are plenty of other SRT parsers on npm.

What's wrong with them?

Nothing is wrong with them.
All of them can handle this format:

1
00:00:11,544 --> 00:00:12,682
Hello

But I want to handle formats like these:

00:00:11.544

This format is wrong: it uses a period as the separator.

Or this:

00:00:11,5440

This format is also wrong: the millisecond field has 4 digits.

Or this:

1:00:11,5

Similarly, the hour and millisecond fields have only 1 digit each (wrong).

Or this:

00:00:00.05

etc
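
To illustrate why the timestamps above count as malformed, here is a small sketch of a strict check for the standard HH:MM:SS,mmm shape. The name `isStrictSrtTime` and the regular expression are assumptions made for this example; they are not part of srt-parser-2.

```javascript
// Strict SRT timestamp: two-digit hour, minute, and second fields,
// a comma separator, and exactly three millisecond digits.
const STRICT_SRT_TIME = /^\d{2}:\d{2}:\d{2},\d{3}$/;

// Hypothetical helper: true only for timestamps in the strict format.
function isStrictSrtTime(time) {
  return STRICT_SRT_TIME.test(time);
}

const samples = [
  "00:00:11,544", // correct format
  "00:00:11.544", // period instead of comma
  "00:00:11,5440", // 4 millisecond digits
  "1:00:11,5", // 1-digit hour and millisecond
  "00:00:00.05", // period and 2 millisecond digits
];

for (const sample of samples) {
  console.log(sample, isStrictSrtTime(sample));
}
```

Only the first sample passes the strict check; the others are the kind of input this parser is meant to accept anyway.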

Format Support

| Format | Other parsers | srt-parser-2 | srt-parser-2 would turn this into |
| --- | --- | --- | --- |
| 00:00:01,544 | Yes ✅ | Yes ✅ | 00:00:01,544 |
| 00:00:01.544 | ❓ Yes for some of them | Yes ✅ | 00:00:01,544 |
| 00:00:01.54 | ❓ Yes for some of them | Yes ✅ | 00:00:01,540 |
| 00:00:00.3333 | No ❌ | Yes ✅ | 00:00:00,333 |
| 00:00:00.3 | No ❌ | Yes ✅ | 00:00:00,300 |
| 1:2:3.4 | No ❌ | Yes ✅ | 01:02:03,400 |

Basic principle:

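  1. If the hour, minute, or second field is shorter than 2 digits, it is padded at the start with "0"; if it is longer than 2 digits, only the first 2 digits are kept.
  2. The millisecond field is handled similarly but normalized to 3 digits; as the table shows, short values are padded at the end (00:00:00.3 becomes 00:00:00,300).
  3. The separator can be a period or a comma; a period (incorrect) is replaced with a comma (correct).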
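
The rules above can be sketched in a few lines of JavaScript. This is an illustration only: `fixDigits` and `normalizeTime` are hypothetical names, not the library's actual functions, and padding milliseconds at the end is inferred from the table row where 00:00:00.3 becomes 00:00:00,300.

```javascript
// Hypothetical helper: force a digit string to exactly `width` characters.
function fixDigits(value, width, padAtEnd) {
  // Longer than `width`: keep only the first `width` digits.
  const trimmed = value.slice(0, width);
  // Shorter than `width`: pad with "0" at the end or at the start.
  return padAtEnd ? trimmed.padEnd(width, "0") : trimmed.padStart(width, "0");
}

// Hypothetical helper: normalize a loose timestamp to HH:MM:SS,mmm.
function normalizeTime(time) {
  // A period separator is incorrect, so replace it with a comma first.
  const [front, ms = ""] = time.trim().replace(".", ",").split(",");
  const [hour = "", minute = "", second = ""] = front.split(":");
  // Hour, minute, and second are padded at the start to 2 digits.
  const hms = [hour, minute, second].map((part) => fixDigits(part, 2, false));
  // Milliseconds are padded at the end to 3 digits (assumption, see above).
  return `${hms.join(":")},${fixDigits(ms, 3, true)}`;
}

console.log(normalizeTime("00:00:00.3333")); // 00:00:00,333
console.log(normalizeTime("00:00:00.3")); // 00:00:00,300
console.log(normalizeTime("1:2:3.4")); // 01:02:03,400
```

The sketch assumes the input contains only digits and the expected separators; it does not validate ranges such as minutes above 59.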

Conclusion

  1. Supports more time formats (even wrong ones)
  2. Has extensive tests

Why did I write this?

I am writing Tern - Subtitle File Translator.

Some users said they had trouble translating some of their .srt files.

I found out that these .srt files use timestamps like 00:00:01.544 and 00:00:00.05.

That's why I wrote this.
