How to convert a tacotron 2 dataset for RH voice that uses HTS? #632

rmcpantoja · 2022-10-01T03:25:31Z

Hello,
I'm starting to create a voice to contribute. I have read the voice creation page on the wiki, but I don't really understand how the dataset should be built. I know the audio must be .raw files in a wav folder, and I assume the transcripts go in a txt file.
A Tacotron 2 dataset is similar, except that it has a wavs folder where all the audio files are WAV (22050 Hz, 16-bit, mono), plus a list.txt file that contains the transcripts.
Here is an example dataset for Tacotron 2: https://drive.google.com/drive/folders/1BQXLRkjATVZWMXB0zU25J08PHdC429AA?usp=sharing
What do I need to do with this dataset to turn it into a dataset for RHVoice?
Thanks.
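For the audio half of the question, a minimal sketch in Python, assuming that ".raw" here means headerless PCM (the same samples as the WAV file, minus the header). The function names and the `wavs` to `wav` folder mapping are only illustrative; the sketch does not resample, so whether RHVoice expects a specific sample rate still has to be checked against the wiki.

```python
import wave
from pathlib import Path


def wav_to_raw(wav_path, raw_path):
    """Write the PCM samples of a WAV file to a headerless .raw file.

    Returns (sample_rate, sample_width_bytes, channels) so the caller can
    check the audio against whatever the target toolchain expects.
    """
    with wave.open(str(wav_path), "rb") as wav:
        params = (wav.getframerate(), wav.getsampwidth(), wav.getnchannels())
        frames = wav.readframes(wav.getnframes())
    # Assumption: a .raw file is just the sample data with no header.
    Path(raw_path).write_bytes(frames)
    return params


def convert_folder(src_dir, dst_dir):
    """Convert every *.wav in src_dir to a .raw file of the same name in dst_dir."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    converted = {}
    for wav_path in sorted(Path(src_dir).glob("*.wav")):
        raw_path = dst / (wav_path.stem + ".raw")
        converted[wav_path.stem] = wav_to_raw(wav_path, raw_path)
    return converted
```

Calling `convert_folder("wavs", "wav")` would write one .raw file per WAV file and return the format of each input, which makes it easy to spot clips that are not 22050 Hz, 16-bit, mono.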

zstanecic · 2022-10-01T05:49:27Z

Hi, it might work, as there are 280 sentences. Can you send me your foma and FST language data files for Spanish so I can test this on my setup first? You will need to send me your data-only language files.

3 replies

rmcpantoja Oct 1, 2022
Author

Hi, I sent you an email to discuss this. I also have larger Tacotron 2 datasets (300+ audio clips) if needed.

louderpages Oct 10, 2022
Maintainer

Hi, I just noticed your post. Actually, I have a Castilian Spanish RHVoice voice sitting on the shelf here. I made it from a few hours of public-domain, single-speaker audio. I think you will need at least 2000 sentences, maybe 20,000 words.

Sorry, I have not had time to download and listen to the wavs. Is it a single speaker? Which country?

You will need the FSTs for Spanish, as Zvonimir says. But Spanish is the easiest of all languages to process.
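To compare a dataset against the sentence and word counts suggested above, a rough sketch in Python. It assumes that list.txt holds one `path|transcript` entry per line, which is a common Tacotron 2 file-list layout but is not confirmed in this thread; `transcript_stats` is a hypothetical helper name.

```python
from pathlib import Path


def transcript_stats(list_path, sep="|"):
    """Count transcript lines and words in a `path|text` style file list."""
    sentences = 0
    words = 0
    for line in Path(list_path).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Assumed layout: everything after the first separator is the transcript.
        # A line without the separator counts as a sentence with zero words.
        _, _, text = line.partition(sep)
        sentences += 1
        words += len(text.split())
    return sentences, words
```

`transcript_stats("list.txt")` returns a `(sentences, words)` pair that can be checked against the roughly 2000 sentences or 20,000 words mentioned above.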

rmcpantoja Oct 19, 2022
Author

Hello! Do you mean that you already have a trained Spanish voice for RHVoice? I'm currently researching G2P (grapheme-to-phoneme conversion) for Spanish and, above all, the foma language, so it will take a while.
Out of curiosity, which dataset did you train with? Is it the Kaggle dataset I'm thinking of, the one that contains the three audiobooks "19 de marzo", "bailén" and "batalla arapiles"?
My Tacotron 2 dataset from the first post is auronplay, Spanish from Spain: 280 audio clips, 48 minutes.
