Please install and set up AIMET before proceeding further. This model was tested with the torch_gpu variant of AIMET 1.22.2.
- Clone the SeanNaren DeepSpeech2 repo:
  git clone https://github.com/SeanNaren/deepspeech.pytorch.git
- Check out this commit ID:
  cd deepspeech.pytorch
  git checkout 78f7fb791f42c44c8a46f10e79adad796399892b
- Append the repo locations to your PYTHONPATH with the following:
  export PYTHONPATH=$PYTHONPATH:<path to parent>/deepspeech.pytorch
  export PYTHONPATH=$PYTHONPATH:<path to parent>/aimet-model-zoo
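  As an optional sanity check, you can confirm both repos are importable from Python. The package names used below (deepspeech_pytorch and aimet_zoo_torch) are assumed from the repo layouts referenced in these instructions:

  ```python
  # Optional check that both repos are reachable via PYTHONPATH.
  # Assumed package names: deepspeech_pytorch (SeanNaren repo) and
  # aimet_zoo_torch (aimet-model-zoo).
  import deepspeech_pytorch
  import aimet_zoo_torch

  print("deepspeech.pytorch and aimet-model-zoo are importable")
  ```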
- Install the requirements:
  pip install -r aimet_zoo_torch/deepspeech2/evaluators/requirements.txt
- The evaluation script will automatically download the model checkpoint from here.
- Run the command below to download the dataset and format the CSV needed for the --test-manifest flag:
  python3 deepspeech.pytorch/data/librispeech.py --files-to-use test-clean.tar.gz
  Details are available in the Datasets section of the SeanNaren repo.
- To run evaluation with QuantSim in AIMET, use the following:
  python deepspeech2_quanteval.py \
    --test-manifest=<path to test manifest csv>
In the included evaluation script, we have manually configured the quantizer ops with the following assumptions (see the sketch after this list):
- Weight quantization: 8 bits, per-tensor asymmetric quantization
- Bias parameters are not quantized
- Model inputs are quantized
- Activation quantization: 8 bits, asymmetric quantization
- Inputs to Conv layers are quantized
- Input and recurrent activations for LSTM layers are quantized
- Quantization scheme is TF-enhanced
- Operations which shuffle data, such as reshape or transpose, do not require additional quantizers
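
The snippet below is a minimal, self-contained sketch of how settings like these map onto AIMET's QuantizationSimModel API; it is not the evaluator's exact code. The tiny Conv/LSTM model and dummy input shape are placeholders so the example runs on its own, and details such as per-tensor asymmetric weights and unquantized biases come from AIMET's quantsim configuration rather than the arguments shown here.

```python
# Minimal sketch, not the actual evaluator: a toy Conv + LSTM model stands in
# for DeepSpeech2 so the snippet is self-contained.
import torch
from aimet_common.defs import QuantScheme
from aimet_torch.quantsim import QuantizationSimModel


class ToyAcousticModel(torch.nn.Module):
    """Placeholder with the same op types (Conv, LSTM, Linear) as DeepSpeech2."""

    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(1, 8, kernel_size=3, padding=1)
        self.lstm = torch.nn.LSTM(input_size=8 * 16, hidden_size=32, batch_first=True)
        self.fc = torch.nn.Linear(32, 29)

    def forward(self, x):
        x = torch.relu(self.conv(x))                    # (N, 8, freq, time)
        n, c, f, t = x.shape
        x = x.permute(0, 3, 1, 2).reshape(n, t, c * f)  # (N, time, features)
        x, _ = self.lstm(x)
        return self.fc(x)


model = ToyAcousticModel().eval()
dummy_input = torch.randn(1, 1, 16, 50)                 # placeholder spectrogram batch

sim = QuantizationSimModel(
    model,
    dummy_input=dummy_input,
    quant_scheme=QuantScheme.post_training_tf_enhanced,  # TF-enhanced scheme
    default_param_bw=8,    # 8-bit weight quantization
    default_output_bw=8,   # 8-bit activation quantization
)

# Compute quantization encodings by running representative calibration data
# through the simulated model; the real evaluator would use LibriSpeech batches.
def calibrate(sim_model, _):
    with torch.no_grad():
        sim_model(dummy_input)

sim.compute_encodings(forward_pass_callback=calibrate, forward_pass_callback_args=None)
```

After compute_encodings, the sim model can be evaluated in place of the original model to measure quantized word error rate, which is what the evaluation command above reports for DeepSpeech2.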