This application uses state-of-the-art source separation models to remove vocals from audio files. UVR's core developers trained all of the models provided in this package (except for the Demucs v3 and v4 4-stem models).
These bundles contain the UVR interface, Python, PyTorch, and other dependencies needed to run the application effectively. No prerequisites are required.
If you do not have FFmpeg or Rubber Band installed and want to avoid going through the process of installing them the long way, follow the instructions below.
**FFmpeg Installation**
- Download the precompiled build [here](https://www.gyan.dev/ffmpeg/builds/ffmpeg-release-essentials.zip)
- From the archive, extract the following file to the UVR application directory:
<summary>MacOS Users: Having Trouble Opening UVR?</summary>
> Due to Apple's strict application security, you may need to follow these steps to open UVR.
>
> First, run the following command in Terminal.app to allow applications from all sources to run. It is recommended that you re-enable this protection once UVR opens properly.
>
> ```bash
> sudo spctl --master-disable
> ```
>
> Second, run the following command to bypass Notarization:
- Once installation is complete, download the correct FFmpeg binary for your system [here](http://www.osxexperts.net) and place it in the main application directory.
- Users can now drop large batches of files and directories as inputs. When directories are dropped, the application will search for any file with an audio extension and add it to the list of inputs.
- Fixed low-resolution icon.
- Added the ability to download models manually if the application can't connect to the internet.
- "Select Saved Settings" option - Allows the user to save the current settings of the whole application. You can also load saved settings or reset them to the default.
- "Right-click" menu - Allows for quick access to important options.
- "Help Hints" option - When enabled, users can hover over options to see pop-up text that describes that option. The right-click menu also allows copying the "Help Hint" text.
- Secondary Model Mode - This option is an expanded version of the "Demucs Model" option that was previously only available to MDX-Net. It is now available in all three AI networks and for any stem. Any model can now be Secondary, and the user can choose how much influence it has on the final result.
- Clicking the "Input" field will pop up a new window that allows the user to go through all of the selected audio inputs. Within this menu, users can:
- "Denoise Output" option produces cleaner results, but the processing time will be longer. This option has replaced Noise Reduction.
- "Spectral Inversion" option uses spectral inversion techniques for a cleaner secondary stem result. This option may slow down the audio export process.
- A "Pre-process" model that lets the user first run inference with a robust vocal or instrumental model and then separate the remaining stems from the instrumental mix it generates. This option can significantly reduce vocal bleed in the other non-vocal stems generated by Demucs.
- The Pre-process model is intended for Demucs separations for all stems except vocals and instrumentals.
- Ensemble Mode has been extended to include the following:
- "Averaging" is a new algorithm that averages the final results.
- Unlimited models in the ensemble.
- Ability to save different ensembles.
- Ability to ensemble outputs for all individual stem types.
- Ability to choose unique ensemble algorithms.
- Ability to ensemble all 4 Demucs stems at once.
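
As a rough illustration of the "Averaging" idea listed above, the sketch below averages the outputs of several models sample by sample. It is not UVR's actual implementation: the `average_stems` helper, the plain-list waveform representation, and the choice to truncate to the shortest input are all assumptions made for this example.

```python
from typing import List, Sequence


def average_stems(stems: Sequence[Sequence[float]]) -> List[float]:
    """Average several mono waveforms sample by sample (hypothetical helper)."""
    if not stems:
        raise ValueError("at least one stem is required")
    # Assumption: truncate to the shortest input so that every output
    # sample is backed by a value from every model.
    length = min(len(stem) for stem in stems)
    return [sum(stem[i] for stem in stems) / len(stems) for i in range(length)]


# Two made-up "model outputs" for the same stem.
model_a = [0.5, 0.25, -0.5, 1.0]
model_b = [0.0, 0.75, -1.0]

averaged = average_stems([model_a, model_b])
print(averaged)
```

Because every model contributes equally, any number of models can be passed in; a weighted variant would be needed to give one model more influence than another.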
## Troubleshooting
### Common Issues
- If FFmpeg is not installed, the application will throw an error if the user attempts to convert a non-WAV file.
- Memory allocation errors can usually be resolved by lowering the "Chunk Size".
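
To make the FFmpeg issue above easier to diagnose, a small pre-flight check along the following lines can report whether a non-WAV input is likely to fail. This is only a sketch: the helper names are hypothetical, and it assumes FFmpeg is discoverable on `PATH`, whereas UVR may also look in its own application directory.

```python
import shutil
from pathlib import Path
from typing import Optional


def needs_ffmpeg(path: str) -> bool:
    """Return True when the input is not a WAV file (hypothetical helper)."""
    return Path(path).suffix.lower() != ".wav"


def ffmpeg_available() -> bool:
    """Return True when an `ffmpeg` executable is found on PATH."""
    return shutil.which("ffmpeg") is not None


def preflight(path: str, has_ffmpeg: Optional[bool] = None) -> str:
    """Describe whether the given input can be converted (hypothetical helper)."""
    if has_ffmpeg is None:
        has_ffmpeg = ffmpeg_available()
    if needs_ffmpeg(path) and not has_ffmpeg:
        return "FFmpeg not found: install it before converting non-WAV files."
    return "ok"


print(preflight("song.mp3"))
print(preflight("song.wav"))
```

The `has_ffmpeg` parameter exists only so the check can be exercised without touching the real system; in normal use it can be left at its default.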
### Issue Reporting
Please be as detailed as possible when posting a new issue.
If possible, click the "Settings Button" to the left of the "Start Processing" button and click the "Error Log" button for detailed error information that can be provided to us.
## License
The **Ultimate Vocal Remover GUI** code is [MIT-licensed](LICENSE).
- **Please Note:** For all third-party application developers who wish to use our models, please honor the MIT license by providing credit to UVR and its developers.
## Credits
- [DilanBoskan](https://github.com/DilanBoskan) - Your contributions at the start of this project were essential to the success of UVR. Thank you!
- [Bas Curtiz](https://www.youtube.com/user/bascurtiz) - Designed the official UVR logo, icon, banner, and splash screen.
- [tsurumeso](https://github.com/tsurumeso) - Developed the original VR Architecture code.
- [Kuielab & Woosung Choi](https://github.com/kuielab) - Developed the original MDX-Net AI code.
- [Adefossez & Demucs](https://github.com/facebookresearch/demucs) - Developed the original Demucs AI code.
- [KimberleyJSN](https://github.com/KimberleyJensen) - Advised and aided the implementation of the training scripts for MDX-Net and Demucs. Thank you!
- [Hv](https://github.com/NaJeongMo/Colab-for-MDX_B) - Helped implement chunks into the MDX-Net AI code. Thank you!
## Contributing
- For anyone interested in the ongoing development of **Ultimate Vocal Remover GUI**, please send us a pull request, and we will review it.
- This project is 100% open-source and free for anyone to use and modify as they wish.
- We only maintain the development and support for the **Ultimate Vocal Remover GUI** and the models provided.