What's New
PyTorch
Enhanced the export pipeline to reduce GPU memory usage when exporting LLMs.
[Experimental] Added support for handling LoRA adapters (via the PEFT API) in AIMET, and enabled export of the artifacts required for QNN (see the first sketch after this list).
Added examples of a training pipeline for distributed KD-QAT; an illustrative distillation step is sketched below.
[Experimental] Added support for blockwise quantization (BQ) for the w4fp16 format, and for low-power blockwise quantization (LPBQ) for the w4a8 and w4a16 formats. This feature requires QuantSim V2. A conceptual sketch of blockwise quantization follows the examples below.
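As a rough illustration of the LoRA flow, the sketch below attaches PEFT LoRA adapters to a toy module and runs it through a generic QuantizationSimModel simulate-and-export pass. The toy module, input shape, and LoRA settings are placeholders; AIMET's dedicated PEFT/LoRA handling and the exact QNN export options may differ from what is shown.

```python
# Minimal sketch, assuming a toy module whose projection layers are named like
# transformer attention projections. Only the PEFT calls and the generic
# QuantizationSimModel flow are standard; the rest is illustrative.
import torch
from peft import LoraConfig, get_peft_model
from aimet_torch.quantsim import QuantizationSimModel

class TinyAttention(torch.nn.Module):
    """Stand-in for a transformer block with LoRA-targetable projections."""
    def __init__(self, dim: int = 64):
        super().__init__()
        self.q_proj = torch.nn.Linear(dim, dim)
        self.v_proj = torch.nn.Linear(dim, dim)

    def forward(self, x):
        return self.q_proj(x) + self.v_proj(x)

model = TinyAttention()
dummy_input = torch.randn(1, 64)

# Attach LoRA adapters via the PEFT API.
lora_config = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"])
peft_model = get_peft_model(model, lora_config)

# Simulate quantization on the adapted model and export artifacts for downstream tools.
sim = QuantizationSimModel(peft_model, dummy_input=dummy_input)
sim.compute_encodings(lambda m, _: m(dummy_input), None)
sim.export(path="./export", filename_prefix="lora_model", dummy_input=dummy_input)
```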
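The next sketch shows one illustrative KD-QAT training step in plain PyTorch (it is not the shipped example): a full-precision teacher provides soft targets while the quantization-simulated student is fine-tuned. All names (teacher, student, loader, temperature) are placeholders.

```python
# Illustrative KD-QAT step: distill a full-precision teacher into a quantization-
# simulated student (e.g. sim.model from QuantSim). Plain PyTorch, placeholder names.
import torch
import torch.nn.functional as F

def kd_qat_step(teacher, student, batch, optimizer, T=2.0, alpha=0.5):
    inputs, labels = batch
    with torch.no_grad():
        teacher_logits = teacher(inputs)
    student_logits = student(inputs)

    # Soft-target distillation loss: KL between temperature-softened distributions.
    kd_loss = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                       F.softmax(teacher_logits / T, dim=-1),
                       reduction="batchmean") * (T * T)
    # Hard-label task loss.
    ce_loss = F.cross_entropy(student_logits, labels)

    loss = alpha * kd_loss + (1 - alpha) * ce_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# For distributed KD-QAT, the student would typically be wrapped in
# torch.nn.parallel.DistributedDataParallel before running this step on each rank.
```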
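To make the BQ/LPBQ terminology concrete, here is a conceptual PyTorch sketch of blockwise 4-bit weight quantization, where each block of weights along the input dimension receives its own scale. This is only the underlying idea, not the QuantSim V2 API; the block size and bit width are placeholder values.

```python
# Conceptual sketch of blockwise 4-bit weight quantization (simulated quantize-dequantize).
import torch

def blockwise_quantize_dequantize(weight: torch.Tensor, block_size: int = 64,
                                  bits: int = 4) -> torch.Tensor:
    out_ch, in_ch = weight.shape
    assert in_ch % block_size == 0, "input dim must be divisible by block_size"
    qmax = 2 ** (bits - 1) - 1                       # symmetric signed range, e.g. +/-7 for 4 bits

    blocks = weight.reshape(out_ch, in_ch // block_size, block_size)
    scale = blocks.abs().amax(dim=-1, keepdim=True) / qmax   # one scale per block
    scale = scale.clamp(min=1e-8)                    # avoid division by zero for all-zero blocks
    q = torch.clamp(torch.round(blocks / scale), -qmax - 1, qmax)
    return (q * scale).reshape(out_ch, in_ch)        # dequantized, quantization-simulated weight

w = torch.randn(128, 256)
w_q = blockwise_quantize_dequantize(w, block_size=64)
print((w - w_q).abs().max())                         # per-block scaling keeps the error small
```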