
Update README.en.md

Made it seem more human.
Derry Tutt 2023-12-26 07:52:02 -06:00 committed by GitHub
parent 1ff1a183ea
commit 1b680a9690
GPG Key ID: 4AEE18F83AFDEB23

@@ -32,26 +32,25 @@ Realtime Voice Conversion GUIgo-realtime-gui.bat
 ![image](https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/assets/129054828/143246a9-8b42-4dd1-a197-430ede4d15d7)
-> The dataset for the pre-training model uses nearly 50 hours of high quality VCTK open source dataset.
-> High quality licensed song datasets will be added to training-set one after another for your use, without worrying about copyright infringement.
+> The dataset for the pre-training model uses nearly 50 hours of high quality audio from the VCTK open source dataset.
+> High quality licensed song datasets will be added to the training-set often for your use, without having to worry about copyright infringement.
 > Please look forward to the pretrained base model of RVCv3, which has larger parameters, more training data, better results, unchanged inference speed, and requires less training data for training.
-## Summary
-This repository has the following features:
+## Features:
 + Reduce tone leakage by replacing the source feature to training-set feature using top1 retrieval;
-+ Easy and fast training, even on relatively poor graphics cards;
-+ Training with a small amount of data also obtains relatively good results (>=10min low noise speech recommended);
-+ Supporting model fusion to change timbres (using ckpt processing tab->ckpt merge);
-+ Easy-to-use Webui interface;
-+ Use the UVR5 model to quickly separate vocals and instruments.
-+ Use the most powerful High-pitch Voice Extraction Algorithm [InterSpeech2023-RMVPE](#Credits) to prevent the muted sound problem. Provides the best results (significantly) and is faster, with even lower resource consumption than Crepe_full.
-+ AMD/Intel graphics cards acceleration supported.
++ Easy + fast training, even on poor graphics cards;
++ Training with a small amounts of data (>=10min low noise speech recommended);
++ Model fusion to change timbres (using ckpt processing tab->ckpt merge);
++ Easy-to-use WebUI;
++ UVR5 model to quickly separate vocals and instruments;
++ High-pitch Voice Extraction Algorithm [InterSpeech2023-RMVPE](#Credits) to prevent a muted sound problem. Provides the best results (significantly) and is faster with lower resource consumption than Crepe_full;
++ AMD/Intel graphics cards acceleration supported;
 + Intel ARC graphics cards acceleration with IPEX supported.
 ## Preparing the environment
-The following commands need to be executed in the environment of Python version 3.8 or higher.
+The following commands need to be executed with Python 3.8 or higher.
 (Windows/Linux)
 First install the main dependencies through pip:
@@ -166,7 +165,7 @@ You might also need to set these environment variables (e.g. on a RX6700XT):
 export ROCM_PATH=/opt/rocm
 export HSA_OVERRIDE_GFX_VERSION=10.3.0
 ````
-Also make sure your user is part of the `render` and `video` group:
+Make sure your user is part of the `render` and `video` group:
 ````
 sudo usermod -aG render $USERNAME
 sudo usermod -aG video $USERNAME