The cloned voice is far from the reference speaker #220
Comments
Does source_se need to come from audio of the same person's voice as the source audio used for inference in order to get close or better clone quality?
I got the following warnings. Could any of them cause the clone similarity to degrade drastically?
I tried using the same person's (the base speaker's) voice/mp3 both for extracting the source_se (tone color embedding) and as the source audio for inference, with a third male voice/mp3 as the reference speaker. The resulting cloned audio, which is sometimes female with a bit of noise, is still far from the male reference audio. Very bizarre! So my conclusion from the experiment is that the source_se and the source audio for inference don't have to come from the same person, or at least that it makes no difference to clone similarity. Just a couple of cents to share ... have fun, Sean
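To put a rough number on "far from the reference", one option is to compare tone color embeddings by cosine similarity. The sketch below assumes two embeddings (for example, one from the reference speaker and one re-extracted from the cloned output) have already been flattened into plain lists of floats. How they are extracted is not shown, and it is only an assumption that cosine similarity tracks perceived similarity for this model. The name `cosine_similarity` is a hypothetical helper, not part of OpenVoice.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length sequences of floats.

    Hypothetical helper for illustration; the inputs stand in for
    flattened tone color embeddings (an assumption, not OpenVoice API).
    """
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embedding has zero norm")
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy vectors only; real embeddings would be much longer.
    reference = [0.2, -0.5, 0.1, 0.9]
    cloned = [0.1, -0.4, 0.3, 0.7]
    print(f"similarity: {cosine_similarity(reference, cloned):.3f}")
```

Comparing this score across runs (same reference, different source audio) would give a repeatable check instead of judging by ear alone.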
Hi,
I am trying out OpenVoice (v1). It runs without errors, but the cloned voice is far from its reference speaker. Sometimes I give it a male reference speaker mp3 and get back a female voice.
I ran the code from "demo_part1.ipynb" and changed only the reference speaker's mp3.
I suspect my torch version is not compatible with the embeddings. I am using:
(Speech2Rag) OpenVoice> pip show torch
Name: torch
Version: 2.1.2+cu121
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: packages@pytorch.org
License: BSD-3
Location: C:\Users\Sean2092\miniconda3\Lib\site-packages
Requires: filelock, fsspec, jinja2, networkx, sympy, typing-extensions
Required-by: pytorch-lightning, torchaudio, torchmetrics, torchvision
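For anyone comparing environments, the versions of several packages can be collected in one go instead of running `pip show` for each. This is a minimal sketch using only the standard library; the package list is an assumption about what is relevant here and should be adjusted to the project's actual requirements.

```python
from importlib import metadata

# Assumed list of packages worth reporting; adjust as needed.
PACKAGES = ["torch", "torchaudio", "numpy", "librosa"]


def installed_versions(names):
    """Return {package name: version string, or None if not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}: {version or 'not installed'}")
```

Posting this output alongside the issue makes it easier for others to spot a version mismatch.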
Could someone who has had success with this help out? I am sure I have something wrong, either libraries or settings, but I cannot figure out what it might be. Please help.
Thanks a lot,
Sean