Update README.md

Anjok07 2020-11-10 03:46:17 -06:00 committed by GitHub
parent 7828e66ac4
commit d20d8c3be7
GPG Key ID: 4AEE18F83AFDEB23

@ -10,12 +10,12 @@
This application is a GUI version of the vocal remover AI created and posted by GitHub user tsurumeso. You can find tsurumeso's original command line version [here](https://github.com/tsurumeso/vocal-remover). Please note that we do not maintain or directly support any of tsurumeso's AI application code. Direct support and development for the **Ultimate Vocal Remover GUI** is only maintained within this repository.
- **Special Thanks**
- [tsurumeso](https://github.com/tsurumeso) - The engeneer who authored the AI code. Thank you for the hard work and dedication put into the AI application this GUI is built around!
- [tsurumeso](https://github.com/tsurumeso) - The engineer who authored the AI code. Thank you for the hard work and dedication put into the AI application this GUI is built around!
- [DilanBoskan](https://github.com/DilanBoskan) - The main GUI code contributor, thank you for helping bring this GUI to life, your hard work and continued support is greatly appreciated!
## Installation
The application was made with Tkinter for cross platform compatibility, so this should work with Windows, Mac, and Linux systems. This application has only been tested on Windows 10 & Linux Ubuntu.
The application was made with Tkinter for cross-platform compatibility, so this should work with Windows, Mac, and Linux systems. This application has only been tested on Windows 10 & Linux Ubuntu.
### Install Required Applications & Packages
@ -35,14 +35,14 @@ pip install torch==1.6.0+cu101 torchvision==0.7.0+cu101 -f https://download.pyto
- Open the file labeled *'VocalRemover.py'*.
- It's recommended that you create a shortcut for the file labeled *'VocalRemover.py'* to your desktop for easy access.
- If you are unabled to open the *'VocalRemover.py'* file, please go to the **troubleshooting** section below.
- If you are unable to open the *'VocalRemover.py'* file, please go to the **troubleshooting** section below.
## Option Guide
### Choose AI Engine:
- This option allows you to toggle between tsurumeso's v2 & v4 AI engines.
- **Please note**, The TTA option and the ability to set the N_FFT value are strictly **v4*** options.
- **Please note:** The TTA option and the ability to set the N_FFT value are only available for the v4 engine.
### Model Selections:
@ -70,18 +70,18 @@ All models released here will have the values they were trained with appended to
- **N_FFT** - 2048
### Checkboxes
- **GPU Conversion** - Seclecting this option ensures the GPU is used for conversions.
- **GPU Conversion** - Selecting this option ensures the GPU is used for conversions.
- NOTE: It will not work if you don't have a CUDA-compatible GPU (CUDA requires an NVIDIA GPU).
- **Post-process** - This option can potentially identify left over instrumental artifacts within the vocal outputs. This option may improve the separation on *some* songs.
- NOTE: Having this option selected can have an adverse effect on the conversion process, depending on the track. Because of this, it's recommended as a last resort.
- **Post-process** - This option can potentially identify leftover instrumental artifacts within the vocal outputs. This option may improve the separation on *some* songs.
- **NOTE:** Having this option selected can potentially have an adverse effect on the conversion process, depending on the track. Because of this, it's only recommended as a last resort.
- **TTA** - This option performs Test-Time-Augmentation to improve the separation quality.
- Having this selected will increase the time it takes to complete a conversion.
- This option is NOT compatible with the v2 AI engine.
- **Output Image** - Selecting this option will include the spectrograms of the instrumental & vocal track audio outputs.
- This option is ***not*** compatible with the *v2 AI engine*.
- **Output Image** - Selecting this option will include the images of the spectrograms for the instrumental & vocal audio outputs.
- **Stack Passes** - This option allows the user to set the number of times a track is to run through a stacked model.
- The best range is 3-5 passes, anymore can cause quality degradation of the track.
- The best range is 3-5 passes; any more than 5 can degrade the quality of the track.
- **Stack Conversion Only** - Selecting this option allows the user to bypass the main model and run a track through a stacked model only.
- **Save All Stacked Outputs** - Having this option selected will auto-generate a new directory to your *'Save to'* path with the track name. The new directory will contain all of the outputs generated by the whole conversion process. The amount of audio outputs will depend on the input number of stack passes.
- **Save All Stacked Outputs** - Having this option selected will auto-generate a new directory, named after the track, in the *'Save to'* path. The new directory will contain all of the outputs generated by the whole conversion process. The number of audio outputs will depend on the number of stack passes entered.
- Each output filename will be appended with the number of passes it has had.
- For example, if you choose 5 stack passes this option will provide you with 5 pairs of audio outputs generated after each pass.
- This option can be very useful in determining the optimal number of passes needed to clean a track.
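
The **Save All Stacked Outputs** behaviour described above can be sketched roughly as follows. This is only an illustration: the function name `stacked_output_paths`, the `_pass_N` suffix, the `.wav` extension, and the `Instrumental`/`Vocals` labels are all assumptions made for the example, not the application's actual naming scheme.

```python
from pathlib import Path


def stacked_output_paths(save_to, track_name, stack_passes):
    """Return hypothetical output paths, one pair per stack pass.

    Assumed layout (not taken from the application's code):
    <save_to>/<track_name>/<track_name>_<stem>_pass_<n>.wav
    """
    if stack_passes < 1:
        raise ValueError("stack_passes must be at least 1")
    # New directory named after the track, inside the 'Save to' path.
    out_dir = Path(save_to) / track_name
    paths = []
    for n in range(1, stack_passes + 1):
        # Each pass yields an instrumental and a vocal output.
        for stem in ("Instrumental", "Vocals"):
            paths.append(out_dir / f"{track_name}_{stem}_pass_{n}.wav")
    return paths


paths = stacked_output_paths("exports", "song", 5)
print(len(paths))  # 5 passes, one pair of outputs per pass
for p in paths[:2]:
    print(p)
```

Listening to the pair from each pass in order is one way to judge where additional passes stop helping.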
@ -93,19 +93,29 @@ All models released here will have the values they were trained with appended to
- **Add New Model** - This button will automatically take you to the models folder.
- If you are adding a new model, make sure to add it accordingly based on the AI engine it was trained on!
- If you wish to add a model trained on the v4 engine, add it to the correct folder located in the 'v4' directory.
- The application will automatically detect any models added without having to restart the application.
- For example, if you wish to add a model trained on the v4 engine, add it to the correct folder located in the 'v4' directory.
- The application will automatically detect any models added to the correct directories without needing a restart.
- **Restart Button** - If the application hangs for any reason, you can hit the circular arrow button immediately to the right of the *'Start Conversion'* button.
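
As a rough sketch of how models added to an engine folder could be picked up without a restart, the snippet below rescans a directory on demand. The helper `find_models`, the `.pth` extension, and the `Main Models` subfolder name are assumptions made for this example; the GUI's real detection logic may differ.

```python
import tempfile
from pathlib import Path


def find_models(models_root, engine, extension=".pth"):
    """List model files under <models_root>/<engine>, including subfolders.

    The folder layout and file extension are assumptions for illustration.
    """
    engine_dir = Path(models_root) / engine
    if not engine_dir.is_dir():
        return []
    return sorted(p for p in engine_dir.rglob(f"*{extension}") if p.is_file())


# Demo with a temporary folder standing in for the models directory.
with tempfile.TemporaryDirectory() as root:
    folder = Path(root) / "v4" / "Main Models"
    folder.mkdir(parents=True)
    (folder / "example_model.pth").touch()
    found = [p.name for p in find_models(root, "v4")]
    missing = find_models(root, "v2")

print(found)
print(missing)
```

Calling a function like this each time the model list is opened, rather than once at startup, is what would make newly added files appear without restarting.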
## Models Included
***PLEASE NOTE: Please do not change the name of the models provided! The required perameters are specified in the filenames.***
**PLEASE NOTE:** Do not change the name of the models provided! The required parameters are specified and appended to the end of the filenames.
Here's a list of the models included within the package -
Here's a list of the models included within the package -
- *(list pending)*
- **v2 AI Engine**
- **Main Models**
- *(list pending)*
- **Stacked Models**
- *(list pending)*
- **v4 AI Engine**
- **Main Models**
- *(list pending)*
- **Stacked Models**
- *(list pending)*
A special thank you to aufr33 for helping expand the dataset used to train these models and for the dilligent advice!
A special thank you to aufr33 for helping me expand the dataset used to train these models and for the helpful training tips.
## Other GUI Notes
@ -113,7 +123,8 @@ A special thank you to aufr33 for helping expand the dataset used to train these
- It will also remember the last directory you accessed to select files to be processed.
- Multiple conversions are supported.
- The ability to drag & drop audio files to convert has also been added.
- Conversion times will greatly depend on your hardware. This application will NOT be friendly to older or budget hardware. Please proceed with caution! Pay attention to your PC and make sure it doesn't overheat. We are not responsible for for any hardware damage.
- Conversion times will greatly depend on your hardware.
- This application will *not* be friendly to older or budget hardware. Please proceed with caution! Pay attention to your PC and make sure it doesn't overheat. ***We are not responsible for any hardware damage.***
## Troubleshooting
@ -132,7 +143,7 @@ The **Ultimate Vocal Remover GUI** code is [MIT-licensed](LICENSE).
## Contributing
For anyone interested in the ongoing develpment of **Ultimate Vocal Remover GUI** please send us a pull request and we will review it. This project is 100% open-source and free for anyone to use and/or modify as they wish.
For anyone interested in the ongoing development of **Ultimate Vocal Remover GUI** please send us a pull request and we will review it. This project is 100% open-source and free for anyone to use and/or modify as they wish.
## References
- [1] Takahashi et al., "Multi-scale Multi-band DenseNets for Audio Source Separation", https://arxiv.org/pdf/1706.09588.pdf