Retrieval-based-Voice-Conversion-WebUI

An easy-to-use Voice Conversion framework based on VITS.

Changelog | FAQ (Frequently Asked Questions)

English | 中文简体 | 日本語 | 한국어 (韓國語)

Check our Demo Video here!

Realtime voice conversion software using RVC: w-okada/voice-changer

An online demo using RVC that converts vocals to acoustic guitar audio: https://huggingface.co/spaces/lj1995/vocal2guitar

Vocal2Guitar demo video: https://www.bilibili.com/video/BV19W4y1D7tT/

The pre-trained model was trained on nearly 50 hours of the high-quality, open-source VCTK dataset.

High-quality licensed song datasets will be added to the training set over time, so you can use them without worrying about copyright infringement.

Summary

This repository has the following features:

Reduces tone leakage by replacing the source feature with the training-set feature using top-1 retrieval;
Easy and fast training, even on relatively poor graphics cards;
Training with a small amount of data still gives relatively good results (at least 10 minutes of low-noise speech is recommended);
Supports model fusion to change timbres (using the ckpt processing tab->ckpt merge);
Easy-to-use WebUI;
Uses the UVR5 model to quickly separate vocals and instruments;
Uses the InterSpeech2023 RMVPE vocal pitch extraction algorithm to prevent the muted sound problem. It gives the best results by a significant margin and is faster than Crepe_full, with even lower resource consumption.
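
The top-1 retrieval idea in the first feature can be illustrated with a minimal sketch. It assumes features are plain lists of floats and uses brute-force squared Euclidean distance; the function names and toy values are hypothetical, and this is not the project's actual implementation, which works on high-dimensional features and would normally use a nearest-neighbor index rather than a linear scan.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def top1_retrieve(source: Sequence[Vector], training: Sequence[Vector]) -> List[List[float]]:
    """Replace each source frame with its single nearest training-set frame."""
    if not training:
        raise ValueError("training set must contain at least one feature vector")
    replaced = []
    for frame in source:
        # Top-1 retrieval: keep only the closest training feature.
        nearest = min(training, key=lambda t: squared_distance(frame, t))
        replaced.append(list(nearest))
    return replaced


# Toy 2-dimensional features (hypothetical values, for illustration only).
training_features = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
source_features = [[0.9, 1.2], [4.0, 4.5], [-0.1, 0.2]]

converted = top1_retrieve(source_features, training_features)
print(converted)  # [[1.0, 1.0], [5.0, 5.0], [0.0, 0.0]]
```

Because every output frame comes from the training set, characteristics of the source speaker that are not present in the training data have less room to leak into the converted voice.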

Preparing the environment

Run the following commands in an environment with Python 3.8 or higher.
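
If you are unsure which interpreter your environment uses, a quick check like the one below can help before installing anything. This is only a convenience sketch; the helper name `python_is_supported` is hypothetical and not part of RVC.

```python
import sys

MIN_VERSION = (3, 8)  # minimum Python version stated in this README


def python_is_supported(version_info=sys.version_info, minimum=MIN_VERSION) -> bool:
    """Return True when the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


current = ".".join(str(part) for part in sys.version_info[:3])
if python_is_supported():
    print(f"Python {current} is supported")
else:
    print(f"Python {current} is too old; 3.8 or higher is required")
```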

(Windows/Linux) First install the main dependencies through pip:

# Install PyTorch-related core dependencies, skip if installed
# Reference: https://pytorch.org/get-started/locally/
pip install torch torchvision torchaudio

# For Windows + Nvidia Ampere architecture (RTX30xx), you need to specify the CUDA version that matches your PyTorch build, as reported in https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/issues/21
#pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117

Then you can use Poetry to install the other dependencies:

# Install the Poetry dependency management tool, skip if installed
# Reference: https://python-poetry.org/docs/#installation
curl -sSL https://install.python-poetry.org | python3 -

# Install the project dependencies
poetry install

You can also use pip to install them:

pip install -r requirements.txt

Mac users can install dependencies via run.sh:

sh ./run.sh

Preparing other pre-models

RVC requires additional pre-trained models ("pre-models") for inference and training.

You need to download them from our Hugging Face space.

Here's a list of the pre-models and other files that RVC needs:

hubert_base.pt

./pretrained

./uvr5_weights

If you want to test v2 models (v2 changes the input from the 256-dimensional feature of 9-layer HuBERT + final_proj to the 768-dimensional feature of 12-layer HuBERT, and adds 3 period discriminators), you will also need to download the additional pre-models:

./pretrained_v2

# If you are using Windows, you may also need this file; skip it if FFmpeg is already installed
ffmpeg.exe
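
Before starting, it can be useful to confirm that the downloads ended up where RVC expects them. The sketch below checks only that the paths listed above exist; it assumes they sit in the repository root (the current working directory), and the helper name `missing_assets` is hypothetical. It does not verify the contents of each folder.

```python
from pathlib import Path
from typing import List

# Names taken from the list above; assumed to be relative to the repository root.
REQUIRED = ["hubert_base.pt", "pretrained", "uvr5_weights"]
OPTIONAL_V2 = ["pretrained_v2"]  # only needed when testing v2 models


def missing_assets(root: Path, include_v2: bool = False) -> List[str]:
    """Return the expected files or folders that are absent under root."""
    names = REQUIRED + (OPTIONAL_V2 if include_v2 else [])
    return [name for name in names if not (root / name).exists()]


missing = missing_assets(Path("."), include_v2=False)
if missing:
    print("Missing pre-models:", ", ".join(missing))
else:
    print("All required pre-models are present.")
```

Pass `include_v2=True` if you also plan to use v2 models.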

Then use this command to start the WebUI:

python infer-web.py

If you are using Windows or macOS, you can download and extract RVC-beta.7z to use RVC directly: run go-web.bat on Windows or sh ./run.sh on macOS to start the WebUI.

There's also a tutorial on RVC in Chinese and you can check it out if needed.

Credits

ContentVec
VITS
HIFIGAN
Gradio
FFmpeg
Ultimate Vocal Remover
audio-slicer
Vocal pitch extraction: RMVPE
- The pretrained model is trained and tested by yxlllc and RVC-Boss.