Audio generated by 3s fast voice cloning (3s极速复刻) has extra or missing words and missing sentences #778

Open
JV-X opened this issue Dec 24, 2024 · 2 comments
@JV-X
JV-X commented Dec 24, 2024

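After setting up the environment with the latest code and README, I opened the Gradio page, selected 3s fast voice cloning (3s极速复刻), entered the synthesis text "一只小羊是养,两只小羊是养,三只小羊就不是养了,是喂。", entered the prompt text, recorded the prompt audio, and clicked the generate button. The generated audio is incomplete: words or whole sentences are missing, and occasionally extra words appear. For example, with this synthesis text the generated audio says "音频文件一只小羊是养,两只小羊就不是养,是喂。". It adds the four characters 音频文件 ("audio file") at the start and drops the middle part about 三只小羊 ("three lambs").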

Note: I am using the CosyVoice-300M-Instruct model.
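To make reports like this easier to compare, the extra and missing characters can be listed mechanically instead of by ear. The following is only a minimal sketch using Python's standard `difflib`; the helper name `diff_transcript` is hypothetical and not part of CosyVoice, and the `heard` string is the hand-written transcript from the report above, not the output of any speech recognizer.

```python
from difflib import SequenceMatcher


def diff_transcript(expected: str, heard: str):
    """Compare the synthesis text with a transcript of the generated audio.

    Returns (missing, extra):
      missing: spans of `expected` that do not appear in `heard`
      extra:   spans of `heard` that are not in `expected`
    This is a hypothetical helper for illustration, not a CosyVoice API.
    """
    missing, extra = [], []
    # autojunk=False keeps the character-level comparison exact.
    matcher = SequenceMatcher(None, expected, heard, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            missing.append(expected[i1:i2])
        if tag in ("insert", "replace"):
            extra.append(heard[j1:j2])
    return missing, extra


# Synthesis text and the transcript written down in the report above.
expected = "一只小羊是养,两只小羊是养,三只小羊就不是养了,是喂。"
heard = "音频文件一只小羊是养,两只小羊就不是养,是喂。"

missing, extra = diff_transcript(expected, heard)
print("missing:", missing)
print("extra:", extra)
```

For the example in this report, the sketch should flag 音频文件 as extra and a span containing 三只小羊 as missing. How the remaining characters are grouped depends on how `SequenceMatcher` aligns the two strings.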

@shirubei

Same 300M-Instruct model, with a 5 s prompt recording and the 3s fast voice cloning feature. With the text above (一只小羊是养,两只小羊是养,三只小羊就不是养了,是喂。), everything works normally.

@liunixgithub

I'm hitting the same problem: words or sentences are missing from the output.
