dataloaders

PyTorch and TFRecords data loaders for several audio datasets.

Datasets

ESC - dataset of environmental sounds

LibriSpeech - corpus of read English speech

NSynth - dataset of annotated musical notes

VoxCeleb2 - human speech, extracted from YouTube interview videos

PyTorch loader
TFRecords loader

GTZAN - audio tracks from a variety of sources annotated with genre class

CallCenter - audio tracks with human and non-human speech

PyTorch DataSet

For validation we frequently use the following scheme:

Read 10 random crops from a file;
Predict a class for each crop;
Average the results.
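
The scheme above can be sketched in plain Python as follows. This is a minimal illustration rather than the code shipped in this repository: the names `random_crops`, `predict_file` and `toy_predict` are hypothetical, and it assumes that the results being averaged are per-class scores returned for each crop. With PyTorch, the crops would typically be stacked into one batch (for example with `torch.stack`) and passed through the model in a single forward call.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple


def random_crops(signal: Sequence[float], crop_len: int, n_crops: int = 10,
                 rng: Optional[random.Random] = None) -> List[Sequence[float]]:
    """Cut n_crops windows of crop_len samples at random offsets."""
    if len(signal) < crop_len:
        raise ValueError("signal is shorter than the requested crop length")
    rng = rng or random.Random()
    max_start = len(signal) - crop_len
    starts = [rng.randint(0, max_start) for _ in range(n_crops)]
    return [signal[s:s + crop_len] for s in starts]


def predict_file(signal: Sequence[float],
                 predict_fn: Callable[[Sequence[float]], Sequence[float]],
                 crop_len: int, n_crops: int = 10,
                 rng: Optional[random.Random] = None) -> Tuple[int, List[float]]:
    """Score every crop, average the scores per class, return (class, means)."""
    crops = random_crops(signal, crop_len, n_crops, rng)
    scores = [predict_fn(crop) for crop in crops]
    n_classes = len(scores[0])
    mean_scores = [sum(s[c] for s in scores) / len(scores)
                   for c in range(n_classes)]
    best = max(range(n_classes), key=mean_scores.__getitem__)
    return best, mean_scores


def toy_predict(crop: Sequence[float]) -> List[float]:
    # Hypothetical stand-in for a model: class 1 is scored by the mean
    # absolute amplitude of the crop, class 0 gets the remainder.
    loud = sum(abs(x) for x in crop) / len(crop)
    return [1.0 - loud, loud]


# Usage: a constant "loud" signal standing in for one decoded audio file.
signal = [0.9] * 1000
label, mean_scores = predict_file(signal, toy_predict, crop_len=100,
                                  rng=random.Random(0))
print(label, mean_scores)
```

Averaging scores rather than taking a majority vote over per-crop labels is one reasonable reading of the last step; a vote would only require replacing the averaging step in `predict_file`.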

For this scheme we have written additional DataLoaders for PyTorch.

juliagusak/dataloaders
