MFCC-speech-recognition

This repository contains an easy-to-train machine learning architecture that can recognize speech commands in real time on low-end, commodity hardware.

Specifically, the architecture uses Mel-frequency cepstral coefficients (MFCCs) as input features to a small neural network, achieving near state-of-the-art classification accuracy.
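As a rough illustration of what the input features look like, the sketch below computes MFCCs for a single audio frame using only the Python standard library. It is not the code from this repository: the function names (`mfcc`, `mel_filterbank`, `power_spectrum`), the 16 kHz sample rate, the 256-sample frame, the 20 mel filters, and the 13 coefficients are all assumptions chosen for the example, and the naive DFT is used only to keep the sketch dependency-free.

```python
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """Naive DFT power spectrum of a Hamming-windowed frame (bins 0..N/2)."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        spectrum.append((re * re + im * im) / n)
    return spectrum

def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    max_mel = hz_to_mel(sample_rate / 2)
    mel_points = [max_mel * i / (num_filters + 1) for i in range(num_filters + 2)]
    bins = [int((n_fft + 1) * mel_to_hz(m) / sample_rate) for m in mel_points]
    filters = []
    for j in range(1, num_filters + 1):
        left, center, right = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(left, center):      # rising slope
            row[k] = (k - left) / (center - left)
        for k in range(center, right):     # falling slope
            row[k] = (right - k) / (right - center)
        filters.append(row)
    return filters

def mfcc(frame, sample_rate=16000, num_filters=20, num_coeffs=13):
    """MFCCs of one frame: power spectrum -> mel energies -> log -> DCT-II."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), sample_rate)
    energies = [sum(w * p for w, p in zip(row, spec)) for row in bank]
    log_e = [math.log(max(e, 1e-10)) for e in energies]  # floor avoids log(0)
    return [sum(le * math.cos(math.pi * c * (j + 0.5) / num_filters)
                for j, le in enumerate(log_e))
            for c in range(num_coeffs)]

# Example input (assumed): a 440 Hz tone, 256 samples at 16 kHz.
frame = [math.sin(2 * math.pi * 440 * i / 16000) for i in range(256)]
features = mfcc(frame)
print(len(features))
```

A vector like `features`, computed per frame, is the kind of compact input a small classifier network could consume; in practice a library FFT would replace the naive DFT.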

Importantly, this implementation takes roughly 10 microseconds on a desktop CPU to process 0.1 s of input audio. In other words, it could still run in real time on a system up to 10,000x slower than our desktop CPU.
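The 10,000x figure follows directly from dividing the audio duration by the inference time. The short sketch below reproduces that arithmetic and shows one way such a latency could be measured. The helper names `real_time_margin` and `measure_latency` are hypothetical, and the timed function is a stand-in workload, not the repository's model.

```python
import time

def real_time_margin(audio_seconds, inference_seconds):
    """How many times slower the hardware could be while still keeping up."""
    return audio_seconds / inference_seconds

def measure_latency(fn, repeats=1000):
    """Average wall-clock seconds per call of fn over `repeats` calls."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats

# Figures quoted above: 0.1 s of audio processed in about 10 microseconds.
margin = real_time_margin(0.1, 10e-6)
print(round(margin))

# Stand-in workload (assumption): replace with a real inference call.
latency = measure_latency(lambda: sum(range(100)))
print(latency > 0)
```

Averaging over many calls matters here because a single call in the microsecond range is close to the resolution and overhead of the timer itself.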

A more comprehensive description of the architecture and its performance can be read here.

This project was originally hosted here.

Repository contents (6 commits):

report
.gitignore
DeepHark.ipynb
LICENSE
README.md
network_state
pytorch_tutorial.ipynb
