Release Support all Types of Languages · DigitalPhonetics/IMS-Toucan

This release extends the toolkit's functionality and provides new checkpoints.

New Features:

support for all phonemes in the IPA standard through an extended lookup of articulatory features
support for some suprasegmental markers in the IPA standard through parsing (tone, lengthening, primary stress)
praat-parselmouth for greatly improved pitch extraction
faster phonemization
word boundaries are added, which are invisible to the aligner and the decoder, but can help the encoder in multilingual scenarios
tonal languages added, tested and included in the pretraining (Chinese, Vietnamese)
Scorer class to inspect data given a trained model and dataset cache (provided pretrained models can be used for this)
intuitive controls for scaling durations and variance in pitch and energy
diverse bugfixes and speed increases
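
To illustrate what scaling durations and the variance in pitch and energy means, here is a minimal sketch of the general idea. It is not the toolkit's actual implementation: the function names `scale_durations` and `scale_variance` are hypothetical, and the example assumes durations are frame counts per phoneme and that pitch or energy is a plain list of per-phoneme values.

```python
from statistics import fmean


def scale_durations(durations, factor):
    """Stretch (factor > 1) or compress (factor < 1) per-phoneme durations.

    Hypothetical helper: durations are assumed to be frame counts.
    """
    return [max(0, round(d * factor)) for d in durations]


def scale_variance(values, factor):
    """Scale the spread of a pitch or energy curve around its mean.

    Hypothetical helper: factor 1.0 leaves the curve unchanged,
    factor 0.0 flattens it to the mean, larger factors exaggerate it.
    """
    mean = fmean(values)
    return [mean + (v - mean) * factor for v in values]
```

With this formulation the mean of the curve is preserved, so a speaker's average pitch stays the same while the intonation becomes flatter or more expressive.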

Note:

This release breaks backwards compatibility. Make sure you are using the associated pretrained models. Old checkpoints and dataset caches become incompatible. Only HiFiGAN remains compatible.
Work on upcoming releases is already in progress. Improved voice adaptation will be our next goal.
To use the pretrained checkpoints, download them, create their corresponding directories and place them into your clone as follows (you have to rename the HiFiGAN and FastSpeech2 checkpoints once in place):

...
Models
└─ Aligner
      └─ aligner.pt
└─ FastSpeech2_Meta
      └─ best.pt
└─ HiFiGAN_combined
      └─ best.pt
...
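
The directory setup above could be scripted roughly as follows. This is a sketch under assumptions: the helper `place_checkpoints` is hypothetical, the `Models` directory is assumed to sit directly under the root of your clone, and the paths of the downloaded files are placeholders that you have to supply yourself.

```python
from pathlib import Path
import shutil

# Target layout taken from the tree above: subdirectory -> expected file name.
LAYOUT = {
    "Aligner": "aligner.pt",
    "FastSpeech2_Meta": "best.pt",
    "HiFiGAN_combined": "best.pt",
}


def place_checkpoints(repo_root, downloads):
    """Create Models/<subdir>/ and copy each checkpoint in under its expected name.

    `downloads` maps each subdirectory name to the path of the downloaded
    file (placeholder paths, whatever the files are called on your machine).
    Copying under the target name takes care of the renaming step.
    """
    placed = []
    for subdir, target_name in LAYOUT.items():
        target_dir = Path(repo_root) / "Models" / subdir
        target_dir.mkdir(parents=True, exist_ok=True)
        target = target_dir / target_name
        shutil.copyfile(downloads[subdir], target)
        placed.append(target)
    return placed
```

Call it with the path to your clone and a dictionary that points each of the three keys at the file you downloaded; the function returns the final checkpoint paths so you can verify them.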
