🍴 Forked from zhangyongmao/VISinger2
📌 This repository is a standalone clone because GitHub does not support Git LFS on public forks.
Fixed PyTorch 2.0.1 compatibility issues; VISinger 2 can now run with CUDA 12.
- Added a FastAPI server for serving inference
- Added Cantonese singing data for training
- Added Cantonese pre-trained model
- Install the Python requirements: pip install -r requirements.txt
- Download the Opencpop dataset.
- Prepare the data in the same layout as data/opencpop (wavs, train.txt, test.txt, train.list, test.list)
- To generate train.list, use
awk -F'|' '{print $1}' train.txt > train.list
- To generate test.list, use
awk -F'|' '{print $1}' test.txt > test.list
- Modify egs/visinger2/config.json (data/data_dir, train/save_dir)
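
The two awk commands in the list above keep only the first |-separated field of each line. Where awk is not available, the same step might be done in Python. This is a minimal sketch, not part of the repository: the function name write_id_list is hypothetical, and it assumes the transcript files are UTF-8 text with the utterance ID in the first field.

```python
from pathlib import Path


def write_id_list(transcript_path, list_path):
    """Write the first '|'-separated field of every line, one per line.

    Mirrors: awk -F'|' '{print $1}' transcript > list
    """
    lines = Path(transcript_path).read_text(encoding="utf-8").splitlines()
    # Keep everything before the first '|' (the whole line if there is none).
    ids = [line.split("|")[0] for line in lines]
    Path(list_path).write_text("".join(f"{i}\n" for i in ids), encoding="utf-8")
    return ids
```

Calling write_id_list("data/opencpop/train.txt", "data/opencpop/train.list"), and the same for test.txt and test.list, should produce the two list files named in the data layout.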
cd egs/visinger2
bash bash/preprocess.sh config.json
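
The config edit that precedes the preprocessing command can also be scripted. The sketch below assumes that data/data_dir and train/save_dir denote nested JSON keys, that is config["data"]["data_dir"] and config["train"]["save_dir"]. The helper name set_paths is hypothetical, so check the key layout against the actual config.json before relying on it.

```python
import json
from pathlib import Path


def set_paths(config_path, data_dir, save_dir):
    """Set the data and checkpoint directories, rewriting the config in place."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    # Assumed nesting: "data/data_dir" -> config["data"]["data_dir"],
    # "train/save_dir" -> config["train"]["save_dir"].
    config.setdefault("data", {})["data_dir"] = str(data_dir)
    config.setdefault("train", {})["save_dir"] = str(save_dir)
    path.write_text(
        json.dumps(config, indent=2, ensure_ascii=False) + "\n", encoding="utf-8"
    )
    return config
```

All other fields are left untouched, so the rewritten file differs from the original only in the two paths and possibly in whitespace.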
cd egs/visinger2
bash bash/train.sh 0
We trained the model for 500k steps with a batch size of 16.
Modify model_dir, input_dir, and output_dir in inference.sh.
cd egs/visinger2
bash bash/inference.sh
Some audio samples can be found on the demo website and on bilibili.
The model pre-trained on Opencpop is here, its config.json is here, and the test-set results synthesized by this pre-trained model are here.