CapGen

A fast, cross-platform, CPU-first video/audio transcriber that generates caption files with Whisper and CTranslate2, hosted on Hugging Face Spaces. A pip-installable offline CLI tool with CUDA support is also provided. Voice Activity Detection (VAD) preprocessing is always enabled.

Requirements

Python 3.11
4 GB RAM

Usage (API)

Send a file to the endpoint with cURL as shown below. Currently, the only available caption formats are srt and vtt.

curl "https://winstxnhdw-CapGen.hf.space/api/v1/transcribe?caption_format=$CAPTION_FORMAT" \
  -F "request=@$AUDIO_FILE_PATH"

The response is JSON, so you can also extract the result field with jq and redirect it to a file.

curl "https://winstxnhdw-CapGen.hf.space/api/v1/transcribe" \
  -F "request=@$AUDIO_FILE_PATH" | jq -r ".result" > result.srt
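
For callers who prefer Python over cURL, the same request can be sketched with the standard library alone. This is a minimal sketch, not the project's own client: the helper names build_request and transcribe are hypothetical, and it assumes that the endpoint accepts a multipart field named request (as in the cURL call) and returns JSON with a result field (as the jq filter suggests).

```python
import json
import uuid
from pathlib import Path
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Endpoint and query parameter taken from the cURL examples above.
ENDPOINT = "https://winstxnhdw-CapGen.hf.space/api/v1/transcribe"
CAPTION_FORMATS = ("srt", "vtt")


def build_request(filename: str, data: bytes, caption_format: str) -> Request:
    """Build a multipart/form-data POST, mirroring `curl -F "request=@file"`."""
    if caption_format not in CAPTION_FORMATS:
        raise ValueError(f"caption_format must be one of {CAPTION_FORMATS}")

    boundary = uuid.uuid4().hex
    disposition = f'Content-Disposition: form-data; name="request"; filename="{filename}"\r\n'
    body = b"".join((
        f"--{boundary}\r\n".encode(),
        disposition.encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        data,
        f"\r\n--{boundary}--\r\n".encode(),
    ))
    url = f"{ENDPOINT}?{urlencode({'caption_format': caption_format})}"
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return Request(url, data=body, headers=headers, method="POST")


def transcribe(path: str, caption_format: str = "srt") -> str:
    """Upload the file and return the caption text (assumed to be in "result")."""
    file = Path(path)
    request = build_request(file.name, file.read_bytes(), caption_format)
    with urlopen(request) as response:
        return json.loads(response.read())["result"]
```

Calling transcribe("audio.mp3", "vtt") and writing the returned string to a file would then be equivalent to the jq pipeline above, under the same assumptions.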

Usage (CLI)

CapGen is available as a CLI tool with CUDA support. You can install it with pip.

pip install git+https://github.com/winstxnhdw/CapGen

You may also install CapGen with the necessary CUDA binaries.

pip install "capgen[cuda] @ git+https://github.com/winstxnhdw/CapGen"

Now, you can run the CLI tool with the following command.

capgen -c srt -o ./result.srt --cuda < ~/Downloads/audio.mp3

usage: capgen [-h] [-g] [-t] [-w] -c  -o  [file]

transcribe a compatible audio/video file into a chosen caption file format

positional arguments:
  file            the file path to a compatible audio/video

options:
  -h, --help      show this help message and exit
  -g, --cuda      whether to use CUDA for inference

cpu:
  -t, --threads   the number of CPU threads
  -w, --workers   the number of CPU workers

required:
  -c, --caption   the chosen caption file format
  -o, --output    the output file path
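
If you want to drive the CLI from a script, for example to caption a batch of files, the options above can be assembled programmatically. The following is a rough sketch rather than part of CapGen: capgen_command and run_capgen are hypothetical helpers, and it assumes that --threads and --workers each take an integer value and that the input is passed as the positional file argument.

```python
import shutil
import subprocess
from typing import List, Optional


def capgen_command(
    file: str,
    caption: str,
    output: str,
    cuda: bool = False,
    threads: Optional[int] = None,
    workers: Optional[int] = None,
) -> List[str]:
    """Assemble a capgen invocation from the options listed in the help text."""
    command = ["capgen", "--caption", caption, "--output", output]
    if cuda:
        command.append("--cuda")
    if threads is not None:
        # Assumption: --threads expects an integer value.
        command += ["--threads", str(threads)]
    if workers is not None:
        # Assumption: --workers expects an integer value.
        command += ["--workers", str(workers)]
    command.append(file)
    return command


def run_capgen(*args, **kwargs) -> subprocess.CompletedProcess:
    """Run capgen, failing early if the CLI has not been installed."""
    if shutil.which("capgen") is None:
        raise FileNotFoundError("capgen is not on PATH; install it with pip first")
    return subprocess.run(capgen_command(*args, **kwargs), check=True)
```

Building the argument list separately from running it keeps the command easy to inspect or log before anything is executed.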

Development

You can install the required dependencies for your editor with the following.

poetry install

You can spin the server up locally with the following. Once the container is running, the Swagger UI is available at localhost:7860/api/docs.

docker build -f Dockerfile.build -t capgen .
docker run --rm -e APP_PORT=7860 -p 7860:7860 capgen
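
The container may take a moment to start accepting requests. As a convenience during development, a small polling helper can wait until the Swagger UI responds. This is only a sketch: wait_until_ready is a hypothetical helper, and the default URL simply reuses the port and path mentioned above.

```python
import time
from urllib.request import urlopen

# Swagger UI address from the instructions above (port 7860).
DOCS_URL = "http://localhost:7860/api/docs"


def wait_until_ready(url: str = DOCS_URL, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Poll the URL until it answers with a 2xx status or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urlopen(url, timeout=interval) as response:
                if 200 <= response.status < 300:
                    return True
        except OSError:
            # Covers connection refused, timeouts, and HTTP error responses.
            pass
        time.sleep(interval)
    return False
```

Calling wait_until_ready() right after docker run returns True once the docs page is reachable and False if the timeout elapses first.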
