Voice2Text

Voice2Text transcribes media files to text files using the Google Speech API.

Installation

Voice2Text needs a GOOGLE_APPLICATION_CREDENTIALS key file. If you don't have one, create a Google Cloud project and download the key from it.

Gcloud Project build

Google Cloud SDK Install

brew install --cask google-cloud-sdk

Setting Gcloud Projects

gcloud auth login
gcloud alpha projects create voicetotext-123456 --name voice2text

Go to the project's URL and enable the Google Speech API.
Enable [billing](https://support.google.com/cloud/answer/6293499?hl=en).
Create a service account key and download it (ref: Service Account).
Set GOOGLE_APPLICATION_CREDENTIALS:

export GOOGLE_APPLICATION_CREDENTIALS='/your/service/account/key/xxx.json'
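
Before running the commands, it can help to confirm that the variable points at a usable key file. The following is a minimal sketch using only the Python standard library; `check_credentials` is a hypothetical helper and not part of voicetotext. It assumes the downloaded key is a JSON file whose "type" field is "service_account", as service account keys from Google Cloud normally are.

```python
import json
import os


def check_credentials(env=None):
    """Return the key file path if GOOGLE_APPLICATION_CREDENTIALS looks usable.

    Hypothetical helper, not part of voicetotext.
    Raises RuntimeError with a readable message otherwise.
    """
    env = os.environ if env is None else env
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    if not os.path.isfile(path):
        raise RuntimeError(f"key file not found: {path}")
    with open(path, encoding="utf-8") as f:
        try:
            key = json.load(f)
        except json.JSONDecodeError as exc:
            raise RuntimeError(f"key file is not valid JSON: {path}") from exc
    # Assumption: service account keys carry "type": "service_account".
    if not isinstance(key, dict) or key.get("type") != "service_account":
        raise RuntimeError("key file does not look like a service account key")
    return path
```

Calling `check_credentials()` with no arguments reads the current environment; passing a dict makes the check easy to try without touching your shell.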

Install

pip install voicetotext

Usage

This application provides two commands. splitvoice splits a voice file into smaller files. voicetotext transcribes the voice files in a folder to text through the Google API. See each command's help for details.

splitvoice --help
voicetotext --help

Sample

Split Audio Files

Sample Japanese voices from here

$ splitvoice voices/hana_1.mp3 --relative
spliting /57
spliting Done!
File was separete 57 filesOutput Separeted files? [Y/n]:y
separeted done! Have a nice Day!⏎
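
This README does not say how splitvoice chooses its split points. One common approach for this kind of tool is to cut at stretches of silence, and the sketch below illustrates that idea on a plain list of sample values. It is an assumption about the general technique, not a description of splitvoice itself; `split_on_silence`, `threshold`, and `min_silence_len` are hypothetical names.

```python
def split_on_silence(samples, threshold, min_silence_len):
    """Return (start, end) index pairs of non-silent runs in `samples`.

    A run of at least `min_silence_len` consecutive samples whose absolute
    value is below `threshold` is treated as a gap between segments.
    Each segment is samples[start:end].
    """
    segments = []
    start = None  # start index of the current segment, or None if in a gap
    quiet = 0     # length of the current run of quiet samples
    for i, s in enumerate(samples):
        if abs(s) >= threshold:
            if start is None:
                start = i
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= min_silence_len:
                # Close the segment at the first quiet sample of this run.
                segments.append((start, i - quiet + 1))
                start = None
                quiet = 0
    if start is not None:
        # Drop any trailing quiet samples from the last segment.
        segments.append((start, len(samples) - quiet))
    return segments
```

Short pauses (fewer than `min_silence_len` quiet samples) stay inside a segment, which keeps words from being cut in half.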

Transcribe Japanese audio files

$ voicetotext results/ -s 22050 -l "ja_JP"
芥川龍之介
花
line
朗読池田秀雄
禅智内供の鼻といえば池で知らないものはない
長澤語録すがって上唇の上から顎の下まで下がっている
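
For reference, a request to the Speech-to-Text v1 REST endpoint carries the same information as the flags above: the sample rate and the language. The sketch below builds such a request body with the standard library only. Whether voicetotext uses the REST endpoint or a client library is not stated in this README, and `build_recognize_request` and `normalize_language_code` are hypothetical helper names. The underscore-to-hyphen conversion is an assumption, based on the API expecting BCP-47 tags such as ja-JP.

```python
import base64


def normalize_language_code(code):
    """Convert a locale-style code such as "ja_JP" to the BCP-47 form "ja-JP"."""
    return code.replace("_", "-")


def build_recognize_request(flac_bytes, sample_rate_hertz, language_code):
    """Build the JSON-serializable body for a speech:recognize request.

    Hypothetical helper: field names follow the Speech-to-Text v1 REST API,
    but this is not necessarily how voicetotext talks to Google.
    """
    return {
        "config": {
            "encoding": "FLAC",
            "sampleRateHertz": sample_rate_hertz,
            "languageCode": normalize_language_code(language_code),
        },
        "audio": {
            # Inline audio must be base64-encoded in the JSON body.
            "content": base64.b64encode(flac_bytes).decode("ascii"),
        },
    }
```

The `sampleRateHertz` value must match the audio file itself, which is the cause of the error described in the next section.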

Error Handling

"Sample rate in request does not match FLAC header."

You need to check the sample rate of the audio files. I recommend ffprobe for this.

$ ffprobe results/000.flac
Input #0, flac, from 'results/000.flac':
  Metadata:
    ENCODER         : Lavf57.56.101
  Duration: 00:00:01.87, start: 0.000000, bitrate: 184 kb/s
    Stream #0:0: Audio: flac, 22050 Hz, mono, s16

The output shows the sample rate. In this case, the sample rate is 22050 Hz, so the command is:

$ voicetotext results -s 22050
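
If ffprobe is not available, the sample rate can also be read directly from the FLAC header. The sketch below is a minimal standard-library reader, assuming the file starts with the "fLaC" marker followed by a STREAMINFO block, as the FLAC format specifies. It does not handle files with a leading ID3 tag, and `flac_sample_rate` and `read_flac_sample_rate` are hypothetical helpers, not part of voicetotext.

```python
def flac_sample_rate(data):
    """Return the sample rate in Hz from the first bytes of a FLAC stream."""
    if data[:4] != b"fLaC":
        raise ValueError("not a FLAC stream (missing 'fLaC' marker)")
    if len(data) < 8 + 13:
        raise ValueError("truncated FLAC header")
    # Metadata block header: 1 byte (last-block flag + 7-bit type), 3 bytes length.
    block_type = data[4] & 0x7F
    if block_type != 0:
        raise ValueError("first metadata block is not STREAMINFO")
    info = data[8:]
    # In STREAMINFO, the sample rate is a 20-bit field starting at byte 10:
    # all of bytes 10 and 11, plus the top 4 bits of byte 12.
    return (info[10] << 12) | (info[11] << 4) | (info[12] >> 4)


def read_flac_sample_rate(path):
    """Read just enough of the file at `path` to extract the sample rate."""
    with open(path, "rb") as f:
        return flac_sample_rate(f.read(42))
```

The returned value is the number to pass to the -s option, for example the result of `read_flac_sample_rate("results/000.flac")`.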

Contributing

Fork it!
Create your feature branch: git checkout -b my-new-feature
Commit your changes: git commit -am 'Add some feature'
Push to the branch: git push origin my-new-feature
Submit a pull request :D

Debugging

# virtualenv
python3 -m venv env
source ./env/bin/activate

# install Python packages
pip install -r requirements.txt

History

License

This software is released under the MIT License, see LICENSE.txt.
