alrakis

This project uses the Bark package in Python, just for fun.

🐶 Bark - Generate Audio from Text

👥 Speaker Prompts

You can provide certain speaker prompts such as NARRATOR, MAN, WOMAN, etc. Please note that these are not always respected, especially if a conflicting audio history prompt is given.

from bark import generate_audio

# Speaker labels at the start of each line hint at who is talking.
text_prompt = """
    WOMAN: I would like an oatmilk latte please.
    MAN: Wow, that's expensive!
"""
audio_array = generate_audio(text_prompt)
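
For longer dialogues it can be handy to assemble the prompt from data instead of writing it by hand. The following is a minimal sketch: `build_dialogue_prompt` is a hypothetical helper, not part of Bark, and it only produces the `SPEAKER: line` text format shown above.

```python
from typing import Iterable, Tuple


def build_dialogue_prompt(turns: Iterable[Tuple[str, str]]) -> str:
    """Join (speaker, line) pairs into 'SPEAKER: line' text, one turn per line."""
    return "\n".join(
        f"{speaker.upper()}: {line.strip()}" for speaker, line in turns
    )


# Hypothetical usage: the result can be passed to generate_audio(text_prompt).
text_prompt = build_dialogue_prompt(
    [
        ("woman", "I would like an oatmilk latte please."),
        ("man", "Wow, that's expensive!"),
    ]
)
print(text_prompt)
```

As noted above, speaker labels are only a hint, so the generated voices may not always match them.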


💻 Installation

pip install git+https://github.com/suno-ai/bark.git

or

git clone https://github.com/suno-ai/bark
cd bark && pip install .

🛠️ Hardware and Inference Speed

Bark has been tested and works on both CPU and GPU (PyTorch 2.0+, CUDA 11.7 and CUDA 12.0). Running Bark requires running transformer models with more than 100M parameters. On modern GPUs with PyTorch nightly, Bark can generate audio in roughly real time. On older GPUs, the default Colab runtime, or a CPU, inference might be 10-100x slower.
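
To get a feel for what those factors mean in practice, here is a back-of-the-envelope sketch. `estimate_generation_seconds` is a hypothetical helper, and the 10x and 100x slowdowns are just the rough bounds quoted above, not measurements.

```python
def estimate_generation_seconds(audio_seconds: float, slowdown: float = 1.0) -> float:
    """Rough wall-clock estimate: clip length times a slowdown factor.

    slowdown=1.0 corresponds to roughly real-time generation.
    """
    if audio_seconds < 0 or slowdown <= 0:
        raise ValueError("audio_seconds must be >= 0 and slowdown must be > 0")
    return audio_seconds * slowdown


clip_seconds = 10.0
realtime = estimate_generation_seconds(clip_seconds)       # modern GPU, roughly real time
slow_low = estimate_generation_seconds(clip_seconds, 10)   # lower bound of the slow range
slow_high = estimate_generation_seconds(clip_seconds, 100) # upper bound of the slow range
print(f"{clip_seconds:.0f}s clip: ~{realtime:.0f}s, or {slow_low:.0f}-{slow_high:.0f}s on slower hardware")
```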

If you don't have new hardware available or if you want to play with bigger versions of our models, you can also sign up for early access to our model playground here.

⚙️ Details

Similar to Vall-E and some other amazing work in the field, Bark uses GPT-style models to generate audio from scratch. Unlike Vall-E, the initial text prompt is embedded into high-level semantic tokens without the use of phonemes. It can therefore generalize to arbitrary instructions beyond speech that occur in the training data, such as music lyrics, sound effects, or other non-speech sounds. A second model then converts the generated semantic tokens into audio codec tokens to generate the full waveform. To enable the community to use Bark via public code, we used the fantastic EnCodec codec from Facebook as the audio representation.
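
The two stages described above can be sketched as a small pipeline. This is only an illustration of the data flow: `generate_in_two_stages` and the toy stand-in stages are hypothetical. The commented-out wiring assumes that `bark.api` exposes `text_to_semantic` and `semantic_to_waveform`, so check that against your installed version.

```python
from typing import Any, Callable, List


def generate_in_two_stages(
    text: str,
    to_semantic: Callable[[str], Any],
    to_waveform: Callable[[Any], Any],
) -> Any:
    """Stage 1: text -> semantic tokens. Stage 2: semantic tokens -> waveform."""
    semantic_tokens = to_semantic(text)
    return to_waveform(semantic_tokens)


# Toy stand-ins so the sketch runs without Bark; they are NOT real models.
def toy_to_semantic(text: str) -> List[int]:
    return [len(word) for word in text.split()]


def toy_to_waveform(tokens: List[int]) -> List[float]:
    return [token / 10 for token in tokens]


waveform = generate_in_two_stages("hello bark", toy_to_semantic, toy_to_waveform)
print(waveform)

# Assumed wiring with Bark installed (verify the names before relying on them):
# from bark.api import text_to_semantic, semantic_to_waveform
# audio_array = generate_in_two_stages("Hello!", text_to_semantic, semantic_to_waveform)
```

Keeping the stages as separate callables makes it easy to inspect or cache the intermediate semantic tokens before decoding them to audio.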

Below is a list of some known non-speech sounds, but we are finding more every day. Please let us know on Discord if you find patterns that work particularly well!

[laughter]
[laughs]
[sighs]
[music]
[gasps]
[clears throat]
— or ... for hesitations
♪ for song lyrics
capitalization for emphasis of a word
MAN/WOMAN: for bias towards speaker
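
The cues listed above are plain text inside the prompt, so they can be composed with ordinary string handling. The helpers below (`cue`, `lyrics`, `emphasize`) are hypothetical conveniences, not part of Bark, and `KNOWN_CUES` only mirrors the list above.

```python
KNOWN_CUES = ("laughter", "laughs", "sighs", "music", "gasps", "clears throat")


def cue(name: str) -> str:
    """Wrap a known non-speech cue in square brackets, e.g. [sighs]."""
    if name not in KNOWN_CUES:
        raise ValueError(f"unknown cue: {name!r}")
    return f"[{name}]"


def lyrics(text: str) -> str:
    """Surround text with note symbols to mark it as song lyrics."""
    return f"♪ {text.strip()} ♪"


def emphasize(word: str) -> str:
    """Capitalize a word to put emphasis on it."""
    return word.upper()


text_prompt = f"MAN: {cue('clears throat')} I {emphasize('really')} need a coffee ... {cue('sighs')}"
song_prompt = lyrics("In the jungle, the mighty jungle")
print(text_prompt)
print(song_prompt)
```

Either string can then be passed to `generate_audio`. As with speaker labels, the model may not always honor a cue.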
