speech-to-text-js

Transcribe audio files in Node.js using the IBM Watson Speech to Text API.

Follow the steps below to place .ogg files in the audio folder, run the program, and collect the transcribed .txt files from the text folder.

This repository was created to transcribe audio files, adding speaker labels and timestamps. Watson's Speech to Text speaker labels feature is currently in beta. Speaker labels are returned only as a separate collection, not attached to the transcribed text. This is a problem when transcribing audio with more than one speaker, because the text by itself does not show where the speaker changes.

The solution to this problem is in localModules/speakerLabelsAlgorithm.js. In this file, the JSON returned from Watson's recognizeStream is pulled apart and put back together as a string, which is then written to the writeStream. Please email brendan.ohandley@gmail.com if you have any thoughts, advice or questions about this algorithm.
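As a rough illustration of the idea, the sketch below merges word timestamps with speaker labels into labeled lines. It is not the code in speakerLabelsAlgorithm.js: the function names `formatTime` and `labelTranscript` are hypothetical, and it assumes the response carries `results[].alternatives[].timestamps` as `[word, from, to]` triples and `speaker_labels` entries whose `from` values match the word start times exactly.

```javascript
// Sketch only: names and structure are assumptions, not the repository's code.

// Convert a time in seconds to HH:MM:SS.
function formatTime(seconds) {
  const total = Math.floor(seconds);
  const pad = (n) => String(n).padStart(2, '0');
  const h = Math.floor(total / 3600);
  const m = Math.floor((total % 3600) / 60);
  return `${pad(h)}:${pad(m)}:${pad(total % 60)}`;
}

// Build one line per speaker turn from a Watson-style response object.
function labelTranscript(response) {
  // Flatten every [word, from, to] triple from the first alternative of each result.
  const words = (response.results || []).flatMap(
    (result) => result.alternatives[0].timestamps || []
  );
  // Index speaker labels by start time so each word can look up its speaker.
  const speakerByStart = new Map(
    (response.speaker_labels || []).map((label) => [label.from, label.speaker])
  );

  const turns = [];
  let current = null;
  for (const [word, from] of words) {
    // A word without a label inherits the current speaker (assumption).
    const speaker = speakerByStart.has(from)
      ? speakerByStart.get(from)
      : current ? current.speaker : 'unknown';
    // Start a new turn whenever the speaker changes.
    if (current === null || speaker !== current.speaker) {
      current = { speaker, start: from, words: [] };
      turns.push(current);
    }
    current.words.push(word);
  }
  return turns
    .map((t) => `${formatTime(t.start)} Speaker - ${t.speaker}: ${t.words.join(' ')}`)
    .join('\n');
}
```

Grouping by speaker change rather than by Watson result keeps one line per turn, even when a single result contains more than one speaker.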

To run the code locally, follow the instructions listed below.

Install Node.js
Install Node Package Manager
Install Modules

    npm install

Get IBM Watson credentials
- Get your credentials here
Create a .gitignore file and add .env to it, so that your credentials are not committed
Create a dotenv file

    touch .env

Store your Watson credential information

    WATSON_USERNAME=<your-watson-username-as-a-string>
    WATSON_PASSWORD=<your-watson-password-as-a-string>
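A minimal sketch of how these two values might be read and checked at startup. The helper `readWatsonCredentials` is hypothetical, and it assumes the variables have already been loaded into `process.env` (for example by the dotenv package); the repository may handle this differently.

```javascript
// Hypothetical helper, not taken from the repository.
// Assumes the .env file was loaded first, e.g. with: require('dotenv').config();
function readWatsonCredentials(env = process.env) {
  const required = ['WATSON_USERNAME', 'WATSON_PASSWORD'];
  // Collect every variable that is absent or empty.
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing in .env: ${missing.join(', ')}`);
  }
  return { username: env.WATSON_USERNAME, password: env.WATSON_PASSWORD };
}
```

Failing early with the names of the missing variables is usually easier to debug than an authentication error from the service.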

Place .ogg files in the 'audio' folder.

    speech-to-text-js/audio

Run the program

    node transcribeAudio/speechToText.js

Transcribed .txt files can be found in the text directory

    speech-to-text-js/text
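The relationship between the two folders can be sketched as follows. The helpers `oggFiles` and `textPathFor` are hypothetical, and the sketch assumes each transcript keeps the base name of its audio file, which this README does not state.

```javascript
const path = require('path');

// Keep only the .ogg files from a directory listing (e.g. from fs.readdirSync('audio')).
function oggFiles(fileNames) {
  return fileNames.filter((name) => path.extname(name).toLowerCase() === '.ogg');
}

// Assumption: audio/<name>.ogg is transcribed to text/<name>.txt.
function textPathFor(audioPath, textDir = 'text') {
  const base = path.basename(audioPath, path.extname(audioPath));
  return path.join(textDir, `${base}.txt`);
}
```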

Look at your lovely transcribed file with speaker labels and timestamps.

    00:00:01 Speaker - 0: Hello

    00:00:08 Speaker - 1: Hi there

    00:00:17 Speaker - 0: Glad you could make it

    00:00:26 Speaker - 1: Me too
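If you want to process a transcript further, lines in the format shown above can be read back into structured records. This is a sketch under the assumption that every line follows the `HH:MM:SS Speaker - N: text` pattern exactly; `parseTranscriptLine` is a hypothetical helper, not part of the repository.

```javascript
// Hypothetical helper: parses one transcript line, or returns null if it does not match.
function parseTranscriptLine(line) {
  const match = /^(\d{2}):(\d{2}):(\d{2}) Speaker - (\d+): (.*)$/.exec(line.trim());
  if (!match) return null;
  const [, h, m, s, speaker, text] = match;
  return {
    seconds: Number(h) * 3600 + Number(m) * 60 + Number(s),
    speaker: Number(speaker),
    text
  };
}
```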
