Voice Activity Detection (VAD) AudioWorklet
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
Runtime Audio Importer plugin for Unreal Engine. Imports audio of various formats at runtime.
Voice activity detection and speaker gender segmentation audiovisual corpus
🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).
silero-vad + whisper.cpp (speech-to-text) for ROS 2
Tr-VAD: An Efficient Transformer based Voice Activity Detection Model
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
A comprehensive AI companion leveraging advanced semantic analysis, sentiment detection, and voice processing to provide personalized and context-aware interactions using Autogen, semantic-router, and VoiceProcessingToolkit.
A Python package to build AI-powered real-time audio applications
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
Code for ICASSP 2024 paper WhisperSeg: Positive Transfer of the Whisper Speech Transformer to Human and Animal Voice Activity Detection
Uses the excellent Silero VAD with the onnxruntime C API for fast detection of audio segments with speech
This is the Python library for an unsupervised, fast method for robust voice activity detection (rVAD), as in the paper rVAD: An Unsupervised Segment-Based Robust Voice Activity Detection Method.
Introduction to Speech Processing
Automatically synchronize and translate subtitles, or create new ones by transcribing, using pre-trained DNNs, Forced Alignments and Transformers. https://subaligner.readthedocs.io/
ASR 2Pass onnxruntime and websocket server, based on FunASR (https://github.com/alibaba-damo-academy/FunASR).
Real-time microphone noise suppression on Linux.
A repository for the code used to produce the results of the ICASSP 2024 paper "Self-Supervised Pretraining for Robust Personalized Voice Activity Detection in Adverse Conditions"
On-device voice activity detection (VAD) powered by deep learning