Quantization

This repository focuses on the quantization of efficient neural network architectures using PyTorch. Quantization is a technique that allows us to reduce the memory and computational requirements of deep learning models, making them more efficient for deployment on various hardware platforms.

In this repository, we primarily target the following architectures for quantization (a generic workflow sketch follows this list):

  • MobileNetV1
  • MobileNetV2
  • PhiNet
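
As a rough illustration of the kind of workflow involved, the sketch below applies PyTorch eager-mode post-training static quantization to a toy convolutional block. The `TinyNet` module, layer sizes, and random calibration data are hypothetical placeholders for illustration only, not code from this repository.

```python
import torch
import torch.nn as nn
import torch.ao.quantization as tq

# Hypothetical toy block; a real backbone such as MobileNetV1 or PhiNet
# would be wrapped with the same QuantStub/DeQuantStub boundaries.
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = tq.QuantStub()      # float -> int8 at the model input
        self.conv = nn.Conv2d(3, 16, 3, padding=1)
        self.bn = nn.BatchNorm2d(16)
        self.relu = nn.ReLU()
        self.dequant = tq.DeQuantStub()  # int8 -> float at the model output

    def forward(self, x):
        x = self.quant(x)
        x = self.relu(self.bn(self.conv(x)))
        return self.dequant(x)

model = TinyNet().eval()
# Fuse Conv+BN+ReLU so the sequence is quantized as a single module
model = tq.fuse_modules(model, [["conv", "bn", "relu"]])
model.qconfig = tq.get_default_qconfig("fbgemm")  # x86 server backend
prepared = tq.prepare(model)                      # insert observers
with torch.inference_mode():                      # calibration pass
    prepared(torch.randn(8, 3, 32, 32))
quantized = tq.convert(prepared)                  # int8 model
```

In practice the calibration pass would use a few representative batches from the target dataset rather than random tensors, and the backend (`fbgemm` vs. `qnnpack`) would be chosen to match the deployment hardware.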

MobileNetV1

Reference

  • Howard, A. G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., and Adam, H. (2017). "MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications." arXiv preprint arXiv:1704.04861.

MobileNetV2

Reference

  • Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., and Chen, L.-C. (2018). "MobileNetV2: Inverted Residuals and Linear Bottlenecks." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

PhiNet

The original PhiNet model was not efficient when quantized. This repository therefore provides:

  • improvements to make the model quantizable (see the sketch after this list);
  • a reduction of inference time on GPU and CPU.
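
Regarding the first point, the hedged sketch below shows one common way to make a residual block quantizable in eager-mode PyTorch: routing the skip-connection addition through `nn.quantized.FloatFunctional` and using a quantization-friendly activation. The `QuantizableResidual` module is an illustrative placeholder, not the actual PhiNet implementation.

```python
import torch
import torch.nn as nn

# In eager-mode quantization, a residual written as `out + identity` has no
# observer attached and breaks the int8 graph; FloatFunctional gives the add
# a module identity so it can be observed and converted to a quantized op.
class QuantizableResidual(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU()                       # quantization-friendly activation
        self.skip_add = nn.quantized.FloatFunctional()

    def forward(self, x):
        out = self.relu(self.bn(self.conv(x)))
        return self.skip_add.add(out, x)            # instead of `out + x`
```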

Reference

  • Paissan, F., Ancilotto, A., and Farella, E. (2022). "PhiNets: A Scalable Backbone for Low-Power AI at the Edge." ACM Transactions on Embedded Computing Systems. DOI: 10.1145/3510832.