Colab/Jupyter notebook to create voices for RHVoice interactively in the cloud, based on the Wiki #672

Open: wants to merge 12 commits into base: master

rmcpantoja
Contributor

@rmcpantoja rmcpantoja commented Dec 10, 2022

This notebook can be used locally via Jupyter Notebook or in the cloud via Google Colab.

It was created to make voice building more interactive, so that the required steps can be run immediately. Test it in Colab!

license

I license this contribution under the terms set out in the Unlicense license.

@cla-bot

cla-bot bot commented Dec 10, 2022

An explicit license to your contribution may be needed. For more information, please visit https://github.com/RHVoice/contrib-licensing

@rmcpantoja
Contributor Author

I license this contribution under the terms set out in the Unlicense license.

@ZachB100

Hey man, thanks so much for this notebook. For the most part everything was working up until I had to change the model settings. When I press run, I get the following output.
"Traceback (most recent call last):
File "../RHVoice/src/scripts/general/voice-building-utils", line 1720, in
args.func(args)
File "../RHVoice/src/scripts/general/voice-building-utils", line 166, in call
params=self.get_configure_params()
File "../RHVoice/src/scripts/general/voice-building-utils", line 155, in get_configure_params
params.update(self.get_analysis_params())
File "../RHVoice/src/scripts/general/voice-building-utils", line 110, in get_analysis_params
params["BAPORDER"]=len(self.get_filter_band_edges())
File "../RHVoice/src/scripts/general/voice-building-utils", line 139, in get_filter_band_edges
nyq_freq=sr//2
TypeError: unsupported operand type(s) for //: 'str' and 'int'"
How do I get around this?

@rmcpantoja
Contributor Author

@ZachB100 It seems this is caused either by the sampling rate setting or by an internal error in the script being executed (voice-building-utils). Could you share your settings, please? Something may be wrong in your training.cfg.

@zstanecic
Contributor

zstanecic commented Dec 17, 2022 via email

@ZachB100

I was just following all of the steps in the notebook exactly. I changed the parameters using the pop-up menus next to each one; I did not edit the training.cfg file. I'm guessing this is some internal error that is out of our control. In that case, is there a way to get an older version of RHVoice that doesn't exhibit this behavior? All of the steps before this one were successful; this is the only part where I'm getting stuck.

@rmcpantoja
Contributor Author

rmcpantoja commented Dec 17, 2022

@ZachB100 Sorry, my bad. It's an internal error that I have just fixed. Below the settings cell, press the show code button, go to the code editor, and find this original line:

!jq --arg pwd "/content/tts" '.wavedir=$pwd+"/wav"|.speaker="$speaker_name"|.language="$language"|.gender="$gender"|.sample_rate="$sample_rate"' training.cfg >training2.cfg &&mv training2.cfg training.cfg

Replace it with:

!jq --arg pwd "/content/tts" '.wavedir=$pwd+"/wav"|.speaker="$speaker_name"|.language="$language"|.gender="$gender"|.sample_rate=$sample_rate' training.cfg >training2.cfg &&mv training2.cfg training.cfg
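
For anyone who prefers to patch the file from a Python cell instead of jq, the same fix can be sketched roughly as follows. This is only an illustration, not the notebook's actual code: the helper name `update_training_cfg` is hypothetical, and it assumes training.cfg is a JSON file with the keys used in the jq line above. The important part is that `sample_rate` is stored as a number, because the script later computes `sr//2`.

```python
import json
from pathlib import Path


def update_training_cfg(cfg_path, wavedir, speaker, language, gender, sample_rate):
    """Hypothetical helper: rewrite training.cfg with sample_rate stored as an integer."""
    path = Path(cfg_path)
    cfg = json.loads(path.read_text(encoding="utf-8"))
    cfg.update({
        "wavedir": wavedir,
        "speaker": speaker,
        "language": language,
        "gender": gender,
        # int() is the point of the fix: "16000" // 2 raises TypeError, 16000 // 2 does not.
        "sample_rate": int(sample_rate),
    })
    path.write_text(json.dumps(cfg, indent=2), encoding="utf-8")
    return cfg
```

A call such as `update_training_cfg("training.cfg", "/content/tts/wav", speaker_name, language, gender, sample_rate)` would then accept the sample rate either as a string or as a number.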

@ZachB100

Awesome, thank you so much, that was fast :-). I'll give this a shot and let you know how it goes. I'm really excited, I've mostly only messed with machine learning-based text to speech training, so I'm curious to see what HTS will produce. When trained in Colab, is it possible to create an NVDA add-on from there, or would I have to modify the exported model on a Windows system for that to happen? I'm really new to all of this, so sorry for all the questions lol. Thanks again!

@rmcpantoja
Contributor Author

Yes, an NVDA add-on can be created in Colab; it can be built with SCons after exporting the voice. For now, I think you'll have to download your voice data manually. I plan to add support for saving RHVoice work to Drive in the future so you won't have to worry about that, although the downside is that it could need a lot of space, depending on the size of the dataset.

@ZachB100

All right, so I was able to get past the model settings with no problem, however when attempting to guess F0 range I get this.
"/usr/local/lib/python3.8/dist-packages/numpy/core/fromnumeric.py:3440: RuntimeWarning: Mean of empty slice.
return _methods._mean(a, axis=axis, dtype=dtype,
/usr/local/lib/python3.8/dist-packages/numpy/core/_methods.py:189: RuntimeWarning: invalid value encountered in double_scalars
ret = ret.dtype.type(ret / rcount)
/usr/local/lib/python3.8/dist-packages/numpy/core/_methods.py:262: RuntimeWarning: Degrees of freedom <= 0 for slice
ret = _var(a, axis=axis, dtype=dtype, out=out, ddof=ddof,
/usr/local/lib/python3.8/dist-packages/numpy/core/_methods.py:222: RuntimeWarning: invalid value encountered in true_divide
arrmean = um.true_divide(arrmean, div, out=arrmean, casting='unsafe',
/usr/local/lib/python3.8/dist-packages/numpy/core/_methods.py:254: RuntimeWarning: invalid value encountered in double_scalars
ret = ret.dtype.type(ret / rcount)
Traceback (most recent call last):
File "../RHVoice/src/scripts/general/voice-building-utils", line 1720, in
args.func(args)
File "../RHVoice/src/scripts/general/voice-building-utils", line 894, in call
min_f0=int(numpy.round(numpy.exp(m-d)))
ValueError: cannot convert float NaN to integer"

@rmcpantoja
Contributor Author

@ZachB100 I think the F0 range of your dataset could not be identified. In that case, you can set the range manually in training.cfg. You could try a minimum of 110 and a maximum of 280.
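
The "Mean of empty slice" warnings in the output above suggest that no F0 values were collected, so the mean comes out as NaN and the `int(...)` conversion fails. A guarded estimate could look roughly like the sketch below. It is not the voice-building-utils code: the function name, the `m + d` upper bound, and the use of the standard library instead of numpy are assumptions made for illustration, and the fallback simply reuses the 110/280 values suggested above.

```python
import math
import statistics


def guess_f0_range(log_f0_values, fallback=(110, 280)):
    """Return (min_f0, max_f0) in Hz, or the fallback when no usable values exist."""
    values = [v for v in log_f0_values if math.isfinite(v)]
    if not values:
        # With nothing to average, the mean would be NaN and int() would fail,
        # which is likely what the traceback above shows; use the fallback instead.
        return fallback
    m = statistics.fmean(values)   # mean of log-F0
    d = statistics.pstdev(values)  # spread of log-F0
    return int(round(math.exp(m - d))), int(round(math.exp(m + d)))
```

With an empty list this returns `(110, 280)` instead of raising, so the result can be written to training.cfg by hand.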

@ZachB100

ZachB100 commented Dec 17, 2022 via email

@ZachB100

ZachB100 commented Dec 18, 2022 via email

@zstanecic
Contributor

zstanecic commented Dec 18, 2022 via email

@zstanecic
Contributor

zstanecic commented Dec 18, 2022 via email

@ZachB100

ZachB100 commented Dec 18, 2022 via email

@ZachB100

ZachB100 commented Dec 19, 2022 via email

@zstanecic
Contributor

zstanecic commented Dec 19, 2022 via email

@ZachB100

ZachB100 commented Dec 19, 2022 via email

@rmcpantoja
Contributor Author

rmcpantoja commented Dec 20, 2022

@ZachB100 Thanks for your message about the bugs. Indeed, I can't guarantee that the resynthesis part will work correctly. The resynthesis itself can be generated, and my intention was to show the results for at least 5 resynthesized audio files in the notebook, but apparently there is an error in that part that I haven't been able to find.
As for SSML, writing the file with %%writefile is an alternative that I have found useful; printf gave me some trouble, and this has worked for me. By the way, what version of Colab are you using? That is, are you connecting to a machine hosted at colab.research.google.com? Thanks again!
BTW, @zstanecic @grzezlo could you review this notebook, please? I need more testers to see what can be fixed. I will try to correct the resynthesis part and also train SLT as a test. Thanks!
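
If neither %%writefile nor printf behaves, the SSML file mentioned above could also be written from plain Python. This is only a sketch: the helper name `write_ssml` and the file name in the usage note are hypothetical, and the markup is reduced to a bare `<speak>` element.

```python
from pathlib import Path
from xml.sax.saxutils import escape


def write_ssml(path, text):
    """Hypothetical helper: wrap text in a minimal <speak> element and save it."""
    # escape() protects &, < and > so the file stays well-formed XML.
    ssml = f"<speak>{escape(text)}</speak>\n"
    Path(path).write_text(ssml, encoding="utf-8")
    return ssml
```

For example, `write_ssml("test.ssml", "Hello, this is a test.")` creates the file without going through shell quoting at all.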

@ZachB100

ZachB100 commented Dec 20, 2022 via email

@rmcpantoja
Contributor Author

@ZachB100 Thank you for your kind words! And yes, I recently tried to fix a lot of errors in the notebook, but I can't guarantee that it works correctly. I have been training on a Jack the Ripper dataset and have hit some errors in the "labelling" part, but that is probably because of my dataset or because some paths are set incorrectly. As I said, I need more testers for this notebook so that I can work on bug fixes more effectively.

@ZachB100

ZachB100 commented Dec 21, 2022 via email

@ZachB100

ZachB100 commented Dec 21, 2022 via email

@ZachB100

ZachB100 commented Dec 24, 2022 via email

@ZachB100

ZachB100 commented Dec 25, 2022 via email

3 participants