Update README.md

2024-11-24 15:30:11 +01:00 · 2020-11-10 02:19:02 -06:00 · 2020-11-10 02:19:02 -06:00 · 67f95cfcaf
commit 67f95cfcaf
parent 7b8def7d69
1 changed files with 49 additions and 24 deletions
--- a/README.md
+++ b/README.md
@@ -14,9 +14,11 @@ The application was made with Tkinter for cross platform compatibility, so this
 ### Install Required Applications & Packages
-1. Download & install Python 3.7 [here](https://www.python.org/ftp/python/3.6.8/python-3.6.8-amd64.exe) (Make sure to check the box that says "Add Python 3.7 to PATH" if you're on Windows)
+1. Download & install Python 3.7 [here](https://www.python.org/ftp/python/3.6.8/python-3.6.8-amd64.exe)
    - Make sure to check the box that says "Add Python 3.7 to PATH" if you're on Windows
 2. Once Python has installed, download Ultimate Vocal Remover GUI Version 4.1.0 here (link pending)
-3. Place the UVR-V4GUI folder contained within the *.zip* file where ever you wish (your documents folder is recommended for ease of access).
+3. Place the UVR-V4GUI folder contained within the *.zip* file wherever you wish.
    - Your documents folder is recommended for ease of access.
 4. From the UVR-V4GUI directory, open the Windows Command Prompt and run the following installs -
 ```
@@ -27,8 +29,8 @@ pip install torch==1.6.0+cu101 torchvision==0.7.0+cu101 -f https://download.pyto
 ### Running the Vocal Remover GUI & Models
 1. Open the file labeled *'VocalRemover.py'*.
-  - It's recommended that you create a shortcut for the file labeled *'VocalRemover.py'* to your desktop for easy access.
+    - It's recommended that you create a shortcut for the file labeled *'VocalRemover.py'* to your desktop for easy access.
-    - If you are having issues opening the *'VocalRemover.py'* file, please go to the **troubleshooting** section below.
+      - If you are having issues opening the *'VocalRemover.py'* file, please go to the **troubleshooting** section below.
 ## Option Guide
@@ -39,32 +41,55 @@ pip install torch==1.6.0+cu101 torchvision==0.7.0+cu101 -f https://download.pyto
 ### Model Selections:
 The v2 & v4 AI engines use different sets of models. The available models for each engine will automatically populate within the model selection dropdowns based on which engine was selected. 
 - **Choose Main Model** - Here is where you choose the main model to convert your tracks with.
- **Choose Stacked Model** - These models are meant to clean up vocal residue left over in the form of vocal pinches and static. The stacked models provided are only meant for instrumental outputs generated by a track that ran through one of the main models. Selecting the *'Stack Passes'* option will enable you to select a stacked model to run with the main model. If you wish to only run a stacked model on a track, make sure the *'Stack Conversion Only* option is checked.
+  - Each of the models provided was trained on different parameters, though they can all convert tracks of any genre.
- The v2 & v4 AI engines use different sets of models. The available models for each engine will automatically populate within the model selection dropdowns based on which engine was selected. 
+  - The variety of models allows the user the chance to determine which one works best for the type of music they're converting.
     - The *'Model Test Mode'* option will allow the user to more easily determine which model is best for the track(s) being converted.
 - **Choose Stacked Model** - These models are meant to clean up vocal artifacts from instrumental outputs. 
  - The stacked models provided are only meant to process instrumental outputs created by a main model. 
  - Selecting the *'Stack Passes'* option will enable you to select a stacked model to run with the main model. 
    - If you wish to only run a stacked model on a track, make sure the *'Stack Conversion Only'* option is checked.
  - The varying main model/stacked model combinations allow the user more flexibility in finding what blend works best for the track(s) they are processing.
    - To reiterate, the *'Model Test Mode'* option exists to make testing different model blends easier for the user.
 ### Parameter Values:
-All models released here will have the values they were trained with appended to the end of the filename like so "MGM-HIGHEND_sr44100_hl512_w512_nf2048.pth". The "sr44100_hl512_w512_nf2048" portion automatically sets those values within the application, so please do not change the model files names. If there are no values appended to the end of a model, the value fields will be editable and auto-populate with default values. The default values are - 
+All models released here will have the values they were trained with appended to the end of their filenames like so, **'MGM-HIGHEND_sr44100_hl512_w512_nf2048.pth'**. The *'_sr44100_hl512_w512_nf2048'* portion automatically sets the *SR*, *HOP LENGTH*, *WINDOW SIZE*, & *N_FFT* values within the application, so please do not change the model filenames. If there are no values appended to the end of a model's filename, the value fields will be editable and auto-populate with default values.
- **SR** - 44100
+- **Default Values:**
- **HOP LENGTH** - 1024
+  - **SR** - 44100
- **WINDOW SIZE** - 512
+  - **HOP LENGTH** - 1024
- **N_FFT** - 2048
+  - **WINDOW SIZE** - 512
  - **N_FFT** - 2048
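The filename convention above can be illustrated with a minimal Python sketch. This is not the application's actual code: the function name `parse_model_params`, the dictionary keys, and the exact matching rules are assumptions made for illustration only. It reads the `sr`, `hl`, `w`, and `nf` tags from a model filename and falls back to the default values listed above when no tags are present.

```python
import re
from pathlib import Path

# Default values from the README, used when a filename carries no tags.
DEFAULTS = {"sr": 44100, "hop_length": 1024, "window_size": 512, "n_fft": 2048}

# Filename tags (as seen in the example filename) mapped to hypothetical key names.
TAGS = {"sr": "sr", "hl": "hop_length", "w": "window_size", "nf": "n_fft"}


def parse_model_params(filename):
    """Return (params, editable) for a model filename.

    `editable` is True when no tags were found, mirroring the README's
    statement that the value fields stay editable in that case.
    """
    stem = Path(filename).stem  # drop the ".pth" extension
    params = dict(DEFAULTS)
    found = False
    # Skip the first "_"-separated token, which is assumed to be the model name.
    for token in stem.split("_")[1:]:
        match = re.fullmatch(r"(sr|hl|w|nf)(\d+)", token)
        if match:
            params[TAGS[match.group(1)]] = int(match.group(2))
            found = True
    return params, not found


params, editable = parse_model_params("MGM-HIGHEND_sr44100_hl512_w512_nf2048.pth")
print(params, editable)
```

Renaming a model file would remove or alter these tags, which is why the filenames should be left unchanged.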
 ### Checkboxes:
- **GPU Conversion** - This option ensures the GPU is used for conversions. It will not work if you don't have a Cuda compatible GPU (Nividia GPU's are most compatible with Cuda).
+- **GPU Conversion** - Selecting this option ensures the GPU is used for conversions.
- **Post-process** - This option can potentially identify left over instrumental artifacts in the vocal outputs. This option may improve the separation on some songs. I recommend only using it if conversions don't come out well.
+  - NOTE: This option will not work if you don't have a CUDA-compatible GPU (Nvidia GPUs are most compatible with CUDA).
- **TTA** - This option performs Test-Time-Augmentation to improve the separation quality. However, having this selected will prolong the time it takes to complete a conversion. *Please note, this option is NOT compatible with the v2 AI engine.*
+- **Post-process** - This option can potentially identify leftover instrumental artifacts within the vocal outputs. This option may improve the separation on *some* songs.
- **Output Image** - This option will include a spectrogram of the instrumental & vocal track outputs.
+  - NOTE: Having this option selected can have an adverse effect on the conversion process, depending on the track. Because of this, it's recommended only as a last resort.
- **Stack Passes** - This option allows you to set the number of times you would like a track to run through a stacked model.
+- **TTA** - This option performs Test-Time-Augmentation to improve the separation quality. 
- **Stack Conversion Only** - Selecting this will allow you to bypass the main model and run a track through a stacked model only.
+  - Having this selected will increase the time it takes to complete a conversion.
- **Save All Stacked Outputs** - If you are performing a stacked conversion, having this option selected will auto-generate a new directory to your *'Save to'* path with the track name. The new directory will contain all of the outputs generated by the whole conversion process. The amount of outputs will depend on how many stack passes you chose.
+  - This option is NOT compatible with the v2 AI engine.
- **Model Test Mode** - This option is meant to make it easier for users to test the results of different models without having to manually create new folders and/or change the filenames. When it's selected, the application will auto-generate a new folder with the name of the selected model(s) in the *'Save to'* path you have chosen. The instrumental & vocal outputs will have the selected model(s) name(s) appended to them and save to the auto-generated directory.
+- **Output Image** - Selecting this option will include the spectrograms of the instrumental & vocal track audio outputs.
 - **Stack Passes** - This option allows the user to set the number of times a track is to run through a stacked model.
  - The best range is 3-5 passes; any more can degrade the quality of the track.
 - **Stack Conversion Only** - Selecting this option allows the user to bypass the main model and run a track through a stacked model only.
 - **Save All Stacked Outputs** - Having this option selected will auto-generate a new directory, named after the track, in your *'Save to'* path. The new directory will contain all of the outputs generated by the whole conversion process. The number of audio outputs will depend on the number of stack passes you entered.
  - The number of passes completed will be appended to each output filename.
    - For example, if you choose 5 stack passes, this option will provide you with 5 pairs of audio outputs, one pair generated after each pass.
  - This option can be very useful in determining the optimal number of passes needed to clean a track.
 - **Model Test Mode** - This option is meant to make it easier for users to test the results of different models, and model combinations, without having to manually create new folders and/or change the filenames. 
  - When this option is selected, the application will auto-generate a new folder with the name of the selected model(s) in the *'Save to'* path you have chosen.
    - The instrumental & vocal output filenames will have the selected model name(s) appended to them and will be saved to the auto-generated directory.
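As a rough illustration of the *'Model Test Mode'* behavior described above, the sketch below builds the auto-generated folder and output filenames. The function name `model_test_paths`, the `_` separator, the `Instruments`/`Vocals` suffixes, and the `.wav` extension are all assumptions made for this example; the application's real naming scheme may differ.

```python
from pathlib import Path


def model_test_paths(save_to, track, main_model, stacked_model=None):
    """Return (folder, [instrumental_path, vocal_path]) for one conversion.

    The folder is named after the selected model(s), and the same name is
    appended to each output filename (hypothetical naming pattern).
    """
    # Join the model names when a stacked model is also selected.
    models = main_model if stacked_model is None else f"{main_model}_{stacked_model}"
    out_dir = Path(save_to) / models
    files = [out_dir / f"{track}_{models}_{kind}.wav" for kind in ("Instruments", "Vocals")]
    return out_dir, files


out_dir, files = model_test_paths("output", "song", "MainA", "StackB")
print(out_dir)
for path in files:
    print(path.name)
```

Keeping each model combination in its own folder means outputs from different test runs never overwrite one another.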
 ### Other Buttons:
- **Add New Model** - This button will automatically take you to the models folder. If you are adding a new model, make sure to add it accordingly based on the AI engine it was trained on! All new models added will automatically be detected without having to restart the application.
+- **Add New Model** - This button will automatically take you to the models folder. 
  - If you are adding a new model, make sure to place it in the folder for the AI engine it was trained on!
    - If you wish to add a model trained on the v4 engine, add it to the correct folder located in the 'v4' directory.
  - The application will automatically detect newly added models without needing a restart.
 - **Restart Button** - If the application hangs for any reason, you can hit the circular arrow button immediately to the right of the *'Start Conversion'* button.
 ## Models Included:
@@ -79,7 +104,7 @@ A special thank you to aufr33 for helping expand the dataset used to train these
 ## Troubleshooting:
- If the VocalRemover.py file won't open *under any circumstances* and all other resources have been exhausted, please do the following - 
+- If the *'VocalRemover.py'* file won't open *under any circumstances* and all other resources have been exhausted, please do the following - 
 1. Open the cmd prompt from the UVR-V4GUI directory
 2. Run the following command - 
@@ -91,9 +116,9 @@ python VocalRemover.py
 ## Other GUI Notes:
 - The application will automatically remember your *'save to'* path upon closing and reopening until you change it.
- You can select as many files as you like. Multiple conversions are supported!
+  - It will also remember the last directory you accessed to select music.
- The ability to drag & drop files to convert has also been added.
+- Multiple conversions are supported.
- The Stacked Model is meant to clean up vocal residue left over in the form of vocal pinches and static. The stacked models provided are only meant for instrumental outputs from track run through one of the main models.
+- The ability to drag & drop audio files to convert has also been added.
 - Conversion times will greatly depend on your hardware. This application will NOT be friendly to older or budget hardware. Please proceed with caution! Pay attention to your PC and make sure it doesn't overheat.
 ## References