Skip to content

Real-time speech recognition via "Mel-Frequency Cepstral Coefficients" neural networks.

License

Notifications You must be signed in to change notification settings

ragibson/MFCC-speech-recognition

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

MFCC-speech-recognition

This repository contains an easy-to-train machine learning architecture that can recognize speech commands on low-end, commodity hardware in real-time.

Specifically, the architecture uses "Mel-frequency cepstral coefficients" as input features to a small neural network, achieving "near state-of-the-art" classification accuracy.

Importantly, this implementation has an inference time of ~10 microseconds on a desktop CPU for 0.1 s of input sound. In other words, it could run in real-time on systems up to 10,000x slower than our desktop CPU.

A more comprehensive description of the architecture and its performance can be read here.

This project was originally hosted here.

About

Real-time speech recognition via "Mel-Frequency Cepstral Coefficients" neural networks.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published