Voice-Verification 😃

Introduction

Feature Extractor

STFT
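
A minimal sketch of the short-time Fourier transform, assuming a Hann window and a naive DFT per frame. The function names, `n_fft`, and `hop` are illustrative choices and are not taken from the code in src; a real pipeline would normally use an FFT-based library routine instead.

```python
import cmath
import math


def hann(n_fft):
    # Periodic Hann window, a common (assumed) choice for STFT analysis.
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / n_fft) for n in range(n_fft)]


def stft(signal, n_fft=8, hop=4):
    """Naive STFT: one list of n_fft // 2 + 1 complex bins per frame."""
    window = hann(n_fft)
    frames = []
    # Slide the window over the signal; incomplete trailing frames are dropped.
    for start in range(0, len(signal) - n_fft + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(n_fft)]
        bins = [
            sum(frame[n] * cmath.exp(-2j * math.pi * k * n / n_fft)
                for n in range(n_fft))
            for k in range(n_fft // 2 + 1)
        ]
        frames.append(bins)
    return frames


# A cosine with exactly 2 cycles per 8-sample frame should peak in bin 2.
signal = [math.cos(2 * math.pi * 2 * n / 8) for n in range(16)]
spec = stft(signal)
peak_bin = max(range(len(spec[0])), key=lambda k: abs(spec[0][k]))
```

Taking `abs()` of each bin gives the magnitude spectrogram that the mel and MFCC stages build on.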

Mel Spec
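
A small sketch of the mel scale that underlies a mel spectrogram, assuming the widely used HTK-style formula `mel = 2595 * log10(1 + f / 700)`. Other conventions exist (for example the Slaney variant), and the helper names here are hypothetical rather than taken from src.

```python
import math


def hz_to_mel(f_hz):
    # HTK-style conversion (an assumed convention).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_mels, f_min, f_max):
    """Centre frequencies (Hz) of n_mels bands spaced evenly on the mel scale."""
    m_min, m_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_max - m_min) / (n_mels + 1)
    return [mel_to_hz(m_min + step * (i + 1)) for i in range(n_mels)]


centers = mel_center_frequencies(4, 0.0, 8000.0)
```

The centres are evenly spaced in mel but spread further apart in Hz as frequency grows, which is why mel filterbanks have finer resolution at low frequencies.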

MFCC

GMM and GMM-UBM

JFA

I-vector

D-vector

X-vector

Wav2vec

Backend model

VggVox

Attention Backend

Loss Functions

Contrastive Loss
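
A minimal sketch of a pairwise contrastive loss on two speaker embeddings, assuming Euclidean distance and the common form `d^2` for same-speaker pairs and `max(0, margin - d)^2` for different-speaker pairs. Some formulations add a factor of 1/2; the margin value and function names are illustrative, not taken from src.

```python
import math


def euclidean(a, b):
    # Euclidean distance between two equal-length embeddings.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(emb_a, emb_b, same_speaker, margin=1.0):
    d = euclidean(emb_a, emb_b)
    if same_speaker:
        # Pull embeddings of the same speaker together.
        return d ** 2
    # Push different speakers apart until they are at least `margin` away.
    return max(0.0, margin - d) ** 2


close_impostor = contrastive_loss([0.0, 0.0], [0.5, 0.0], same_speaker=False)
far_impostor = contrastive_loss([0.0, 0.0], [3.0, 4.0], same_speaker=False)
```

Different-speaker pairs that are already farther apart than the margin contribute zero loss, so training focuses on the pairs that are still confusable.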

Triplet Loss
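
A minimal sketch of a triplet loss, assuming squared Euclidean distance and the hinge form `max(0, d(a, p) - d(a, n) + margin)`. The margin of 0.2 and the helper names are assumptions for illustration, not values taken from src.

```python
def squared_distance(a, b):
    # Squared Euclidean distance between two equal-length embeddings.
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge loss: the positive should be closer to the anchor than the negative."""
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Anchor and positive coincide, negative is far away: the triplet is satisfied.
easy = triplet_loss([0.0, 0.0], [0.0, 0.0], [1.0, 0.0])
# Positive is farther from the anchor than the negative: the triplet is violated.
hard = triplet_loss([0.0, 0.0], [1.0, 0.0], [0.0, 0.0])
```

Here the anchor and positive are embeddings of utterances from the same speaker, and the negative comes from a different speaker.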

GE2E loss

Dataset

Data can be found at link
Organize your data directory to match the layout in ./data

Usage

For a quick end-to-end run, follow the Colab notebooks in /notebooks (not final yet).
If you want more control, go through the code in src step by step (make sure you are in the right folder before running the commands):
- Inspect the data: python3 utils.py
- Prepare the dataset before training: python3 build_data.py --data_root --training_pairs --max_wav_len
- Train: python3 train.py --n_mfcc --sample_rate --batch_size 64 --epoch_n --lin_neurons
- Test: python3 predict.py --limit
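
The training step above can be sketched as a small launcher that assembles the command line from keyword arguments. The flag names come from the usage line; every value except `batch_size=64` is a placeholder assumption, and `build_command` is a hypothetical helper, not part of src.

```python
import shlex


def build_command(script, **flags):
    """Assemble `python3 <script> --flag value ...` as an argument list."""
    cmd = ["python3", script]
    for name, value in flags.items():
        cmd += [f"--{name}", str(value)]
    return cmd


# Only batch_size=64 appears in the usage line; the other values are placeholders.
train_cmd = build_command(
    "train.py",
    n_mfcc=40,
    sample_rate=16000,
    batch_size=64,
    epoch_n=10,
    lin_neurons=512,
)
print(shlex.join(train_cmd))
```

To actually launch the run, the list could be passed to `subprocess.run(train_cmd, check=True)` from inside the src folder.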

References

YouTube channel with lots of useful tutorials
GMM paper (pdf)
Adaptive GMM paper (pdf)
JFA paper (pdf)
I-vector paper (pdf)
D-vector paper (pdf)
X-vector paper (pdf)
Attention backend with x-vector paper (pdf)
Wav2vec paper (pdf)
GE2E loss paper (pdf)
VggVox paper (pdf)

Repository: manhph2211/Speech-Processing
