Controllable Speakers
Flux9665 released this 25 Oct 15:16
This release extends the toolkit's functionality and provides new checkpoints.
- self-contained embeddings: we no longer use an external embedding model for TTS conditioning. Instead, we train one that is specifically tailored to this use.
- new vocoder: Avocodo replaces HiFi-GAN
- new controllability options through artificial speaker generation
- quality-of-life changes, such as Weights & Biases integration, a graphical demo script and automated model downloading
- diverse bug fixes and speed increases
This release breaks backwards compatibility. Please download the new models, or stick to a prior release if you rely on your old models.