Is it possible to finetune a medium quality checkpoint to create a high quality checkpoint? #647

Open
yilmazay74 opened this issue Nov 21, 2024 · 1 comment

Comments

@yilmazay74

Hi,
So far I have created some medium-quality voices using the Turkish sample checkpoints provided on the 'download voices' page.
There is a nice example which was trained until approximately 5600 steps, and when I fine-tune it with about 300 of our own samples until 10k steps, it gives reasonably good results. However, we want much better quality.
So I wanted to try high quality.
When I tried to fine-tune the sample medium-quality voice as a high-quality model, it threw a lot of mismatch errors.
So it looks like it is not possible to fine-tune from a different quality level, but I am not sure.
Currently I am trying to fine-tune an English high-quality checkpoint at 2000 steps with my own Turkish samples of about 300 files.
(Since there is no high-quality sample checkpoint for Turkish, I thought English would be the best place to start.)
However, I am not very optimistic about it.
Could anyone guide me on the best way to create better-quality TTS models?
Thanks in advance.

@danielizham

According to this part of the tutorial referenced in the training guide, the quality of the base model has to be the same as that of the finetuned one.
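
A likely explanation for the mismatch errors, stated here as an assumption rather than something confirmed in this thread, is that the quality levels use different layer sizes, so the tensors stored in a medium-quality checkpoint do not fit a high-quality model. The minimal sketch below illustrates that kind of check in plain Python. The function name `find_shape_mismatches`, the parameter names, and the shapes are all hypothetical and are not taken from any real checkpoint; with real files you would first have to extract a name-to-shape mapping from each checkpoint yourself.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(base: Dict[str, Shape], target: Dict[str, Shape]) -> List[str]:
    """Describe every parameter of `base` that would not load into `target`."""
    problems = []
    for name in sorted(set(base) | set(target)):
        if name not in base:
            problems.append(f"missing in base: {name}")
        elif name not in target:
            problems.append(f"unexpected in base: {name}")
        elif base[name] != target[name]:
            problems.append(
                f"size mismatch for {name}: {base[name]} vs {target[name]}"
            )
    return problems


# Hypothetical name -> shape mappings, invented for illustration only.
medium_like = {
    "decoder.conv_pre.weight": (256, 192, 7),
    "encoder.emb.weight": (256, 192),
}
high_like = {
    "decoder.conv_pre.weight": (512, 192, 7),
    "encoder.emb.weight": (256, 192),
}

mismatches = find_shape_mismatches(medium_like, high_like)
for line in mismatches:
    print(line)
```

If a comparison like this reports any mismatch between the base checkpoint and the model you want to train, starting from a base checkpoint of the same quality level is the safer route.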
