Training wavenet to rap? #410

Open
constantinethegr8 opened this issue May 16, 2023 · 0 comments
I've heard that Tacotron 2 needs very little data, around 100-300 sentences, to produce good-sounding speech. However, it handles tempo badly. I've seen that WaveNet can be adapted for music, and I wondered whether the model can be conditioned to do TTS with rhythm. Even if it is possible (hopefully), I have heard it requires large amounts of data, in the tens of GB. Can WaveNet be trained with only 1-2 GB, maybe no more than 4 GB, and still get good results? And if it can, how does one prepare a dataset (that is, how do I condition it)? Do I chop the audio, or split it into each line the rapper spoke, or give the full acapella? Do I use one WAV file or multiple? Also, what audio format, number of channels, and sample rate should I use? Sorry, I am extremely new. Any help would be appreciated. Thanks.

Flavius Valerius Constantinus, The Last Roman Emperor
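
For the "chop the audio" part of the question, here is a minimal sketch of one possible preprocessing step. It is not this project's documented pipeline. It assumes the acapella is a 16-bit PCM WAV file, averages the channels down to mono, and cuts one clip per stretch of non-silent audio. The function names (`to_mono`, `split_on_silence`, `write_clips`) and the threshold and duration defaults are hypothetical and would need tuning for real recordings. Whether this model wants mono, and which sample rate it expects, are not stated in the post, so the sketch keeps the input's sample rate.

```python
import array
import wave


def to_mono(samples, channels):
    """Average interleaved 16-bit samples down to a single channel."""
    if channels == 1:
        return list(samples)
    return [
        sum(samples[i:i + channels]) // channels
        for i in range(0, len(samples) - channels + 1, channels)
    ]


def split_on_silence(samples, sample_rate, threshold=500,
                     min_silence_s=0.3, min_clip_s=0.5):
    """Return (start, end) sample indices of non-silent regions.

    A region ends once at least `min_silence_s` seconds stay below
    `threshold` in absolute amplitude; regions shorter than
    `min_clip_s` seconds are dropped. All defaults are assumptions.
    """
    min_silence = int(min_silence_s * sample_rate)
    min_clip = int(min_clip_s * sample_rate)
    clips = []
    start = None      # index where the current loud region began
    last_loud = None  # index of the most recent loud sample
    for i, s in enumerate(samples):
        if abs(s) >= threshold:
            if start is None:
                start = i
            last_loud = i
        elif start is not None and i - last_loud >= min_silence:
            if last_loud + 1 - start >= min_clip:
                clips.append((start, last_loud + 1))
            start = None
    # Close a region that runs to the end of the file.
    if start is not None and last_loud + 1 - start >= min_clip:
        clips.append((start, last_loud + 1))
    return clips


def write_clips(in_path, out_prefix):
    """Split one WAV file into mono clips; return the paths written."""
    with wave.open(in_path, "rb") as src:
        if src.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        rate = src.getframerate()
        channels = src.getnchannels()
        # WAV data is little-endian; array("h") uses native byte order,
        # so this assumes a little-endian machine.
        raw = array.array("h", src.readframes(src.getnframes()))
    mono = to_mono(raw, channels)
    paths = []
    for n, (a, b) in enumerate(split_on_silence(mono, rate)):
        path = f"{out_prefix}_{n:04d}.wav"
        with wave.open(path, "wb") as dst:
            dst.setnchannels(1)
            dst.setsampwidth(2)
            dst.setframerate(rate)  # keep the source sample rate
            dst.writeframes(array.array("h", mono[a:b]).tobytes())
        paths.append(path)
    return paths
```

Calling `write_clips("acapella.wav", "clips/line")` (both paths are placeholders, and the output directory must already exist) would produce files named `clips/line_0000.wav`, `clips/line_0001.wav`, and so on. Silence-based cuts only approximate "one clip per line", so the results would still need checking by ear.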
