Cool_Tools/Retrieval-based-Voice-Conversion-WebUI

mirror of synced 2024-11-27 17:00:54 +01:00

README.en.md

Retrieval-based-Voice-Conversion-WebUI

An easy-to-use Voice Conversion framework based on VITS.

Changelog | FAQ (Frequently Asked Questions)

English | 中文简体 | 日本語 | 한국어 (韓國語)

Check our Demo Video here!

Realtime Voice Conversion Software using RVC : w-okada/voice-changer

The pre-trained model was trained on nearly 50 hours of the high-quality, open-source VCTK dataset.

High-quality licensed song datasets will be added to the training set over time, so you can use them without worrying about copyright infringement.

Summary

This repository has the following features:

Reduces tone leakage by replacing source features with training-set features using top-1 retrieval;
Easy and fast training, even on relatively weak graphics cards;
Relatively good results even with a small amount of training data (at least 10 minutes of low-noise speech recommended);
Model fusion to change timbres (using ckpt processing tab->ckpt merge);
Easy-to-use WebUI;
Uses the UVR5 model to quickly separate vocals and instruments.
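The top-1 retrieval idea in the first feature can be illustrated with a small sketch. This is not the repository's implementation: the function names `squared_distance` and `top1_replace` and the toy 2-dimensional features are assumptions made for illustration, and a brute-force search in plain Python stands in for the faiss index that the project depends on.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def top1_replace(source_feats: Sequence[Vector],
                 train_feats: Sequence[Vector]) -> List[List[float]]:
    """Replace every source frame with its single nearest training-set frame.

    Hypothetical helper: a brute-force stand-in for an index lookup.
    """
    replaced = []
    for frame in source_feats:
        # Top-1 retrieval: keep only the closest training-set feature.
        nearest = min(train_feats, key=lambda cand: squared_distance(frame, cand))
        replaced.append(list(nearest))
    return replaced


# Toy data: three training-set frames and two source frames.
train = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
source = [[0.9, 1.2], [4.0, 4.5]]
converted = top1_replace(source, train)
print(converted)
```

Because every output frame is taken from the training set, the converted features carry the target speaker's characteristics rather than the source speaker's, which is the intuition behind reduced tone leakage.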

Preparing the environment

We recommend installing the dependencies with Poetry.

Run the following commands in an environment with Python 3.8 or higher:

# Install PyTorch-related core dependencies, skip if installed
# Reference: https://pytorch.org/get-started/locally/
pip install torch torchvision torchaudio

# For Windows with an Nvidia Ampere GPU (RTX 30xx), specify the CUDA version that matches your PyTorch build, as described in https://github.com/liujing04/Retrieval-based-Voice-Conversion-WebUI/issues/21
#pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117

# Install the Poetry dependency management tool, skip if installed
# Reference: https://python-poetry.org/docs/#installation
curl -sSL https://install.python-poetry.org | python3 -

# Install the project dependencies
poetry install
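Before running the commands above, it may help to confirm that the interpreter meets the stated minimum and that the PyTorch packages are importable. The following is a minimal sketch using only the standard library; the helper names `python_is_supported` and `missing_packages` are hypothetical and not part of RVC.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 8)  # minimum version stated in this README


def python_is_supported(version=None):
    """Return True if the (major, minor) version is at least MIN_PYTHON."""
    major_minor = tuple((version or sys.version_info)[:2])
    return major_minor >= MIN_PYTHON


def missing_packages(names=("torch", "torchvision", "torchaudio")):
    """Return the top-level packages from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("Python OK:", python_is_supported())
print("Missing packages:", missing_packages())
```

An empty list from `missing_packages()` only shows that the packages can be located; it does not verify that the installed CUDA build matches your GPU.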

You can also use pip to install the dependencies.

Note: faiss 1.7.2 raises Segmentation Fault: 11 on macOS. If you install it manually with pip, use pip install faiss-cpu==1.7.0 instead.

pip install -r requirements.txt
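If you script the installation, the macOS note can be encoded as a small platform check. This is only a sketch: `faiss_pin` is a hypothetical helper, and returning `None` on other systems simply means this README states no pin for them.

```python
import platform
from typing import Optional


def faiss_pin(system: Optional[str] = None) -> Optional[str]:
    """Return the faiss requirement this README recommends for `system`.

    `system` uses the values of platform.system(), e.g. "Darwin" for macOS.
    """
    system = system or platform.system()
    if system == "Darwin":
        # faiss 1.7.2 is reported above to segfault on macOS.
        return "faiss-cpu==1.7.0"
    return None  # no specific pin is stated for other platforms


print(faiss_pin())
```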

Preparation of other Pre-models

RVC requires additional pre-trained models for inference and training.

You need to download them from our Huggingface space.

Here is a list of the pre-trained models and other files that RVC needs:

hubert_base.pt

./pretrained 

./uvr5_weights

The v2 model changes the input from the 256-dimensional feature of 9-layer HuBERT + final_proj to the 768-dimensional feature of 12-layer HuBERT, and adds 3 period discriminators. If you want to test the v2 model, you also need to download these additional files:

./pretrained_v2

# If you are using Windows, you may also need this file; skip if FFmpeg is already installed
ffmpeg.exe
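After downloading, a quick check can confirm that the files listed above are in place. The sketch below assumes everything sits directly under the repository root, as the `./` paths suggest (the location of `hubert_base.pt` is an assumption), and `missing_assets` is a hypothetical helper, not part of RVC.

```python
import platform
import shutil
from pathlib import Path
from typing import List, Optional


def missing_assets(root, use_v2: bool = False,
                   system: Optional[str] = None) -> List[str]:
    """List the required files and folders that are absent under `root`."""
    root = Path(root)
    system = system or platform.system()
    required = ["hubert_base.pt", "pretrained", "uvr5_weights"]
    if use_v2:
        # Only needed when testing the v2 model.
        required.append("pretrained_v2")
    if system == "Windows" and shutil.which("ffmpeg") is None:
        # Only needed on Windows when FFmpeg is not already installed.
        required.append("ffmpeg.exe")
    return [name for name in required if not (root / name).exists()]


print("Missing:", missing_assets("."))
```

The check only looks at whether each path exists; it does not verify that the folders contain every model file or that the downloads are intact.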

Then use this command to start the WebUI:

python infer-web.py

If you are using Windows, you can instead download and extract RVC-beta.7z to use RVC directly, then run go-web.bat to start the WebUI.

A tutorial on RVC is also available in Chinese.

Credits

Thanks to all contributors for their efforts