Compression UG edits. #3352

Closed
wants to merge 37 commits into from

dwelsch-esi
Contributor

Style and grammar edits of the Compression User Guide.


Spatial SVD is a tensor decomposition technique which decomposes one large layer (in terms of mac or memory) into two smaller layers. SVD stands for Singular Value Decomposition.
Spatial singular value decomposition (SSVD) is a technique that decomposes one large convolution (Conv) MAC or memory layer into two smaller layers.

Can we not abbreviate Spatial SVD to SSVD?

- ℎ is the height of the kernel
- 𝑤 is the width of the kernel

SSVD decomposes the kernel into two kernels, one of size (𝑚,𝑘,ℎ,1) and one of size (𝑘,𝑛,1,𝑤), where 𝑘 is called the `rank`. The smaller the value of 𝑘, the larger the degree of compression.

Same comment as above. Please replace SSVD w/ Spatial SVD.


SSVD decomposes the kernel into two kernels, one of size (𝑚,𝑘,ℎ,1) and one of size (𝑘,𝑛,1,𝑤), where 𝑘 is called the `rank`. The smaller the value of 𝑘, the larger the degree of compression.

The following figure illustrates how SSVD decomposes both the output channel dimension and the size of the Conv kernel itself.

Same comment as above. Please replace SSVD w/ Spatial SVD.
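
For reviewers checking the claim that a smaller rank 𝑘 gives more compression, here is a minimal sketch of the weight-count arithmetic. It assumes the original Conv kernel has size (𝑚,𝑛,ℎ,𝑤), which the quoted hunk does not show, and the function names are hypothetical, not part of any library API.

```python
def spatial_svd_param_counts(m, n, h, w, k):
    """Return (original, decomposed) weight counts for one Conv kernel.

    Assumption: the original kernel has size (m, n, h, w) and is replaced by
    two kernels of size (m, k, h, 1) and (k, n, 1, w), where k is the rank.
    """
    original = m * n * h * w
    decomposed = (m * k * h * 1) + (k * n * 1 * w)
    return original, decomposed


def remaining_fraction(m, n, h, w, k):
    """Fraction of the original weights that remain after decomposition."""
    original, decomposed = spatial_svd_param_counts(m, n, h, w, k)
    return decomposed / original


if __name__ == "__main__":
    # Hypothetical 3x3 Conv layer with channel sizes 64 and 128.
    for rank in (8, 16, 32, 64):
        fraction = remaining_fraction(64, 128, 3, 3, rank)
        print(f"rank={rank:3d}  remaining weights={fraction:.4f}")
```

Halving the rank halves the remaining weight count in this example, which matches the statement in the hunk that a smaller 𝑘 means a larger degree of compression.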


Overview
========
The model compression methods, Spatial SVD and Channel Pruning work on per layer basis. Not all the layers in the given model are equally compressible. Compression of individual layers of a given model can have varying impact on the final accuracy of the model. Greedy Per Layer Compression Ratio Selection Algorithm is used to assess the sensitivity of applicable layers to compression and find appropriate compression-ratio for each individual layers. The algorithm makes sure that the entire model has highest remaining accuracy and also meets the given target compression-ratio.

Spatial SVD (SSVD) and channel pruning (CP) work on individual layers of a model. Not all the layers are equally compressible, so compression of a given layer has a variable impact on the final model accuracy. The greedy per-layer compression ratio selection algorithm assesses the sensitivity of layers to compression and finds an appropriate compression ratio for each layer. The algorithm ensures that the model maintains the highest possible accuracy while meeting the target compression ratio.

Same comment here. Replace Spatial SVD (SSVD) w/ Spatial SVD.
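
To make the Overview paragraph concrete, the following is a toy sketch of one way a greedy per-layer selection could work. It is not confirmed by this PR to be the actual implementation. It assumes that each layer has already been scored at a few candidate compression ratios, and that a compression ratio means remaining cost divided by original cost. All names and numbers are hypothetical.

```python
def select_compression_ratios(eval_scores, layer_costs, target_ratio):
    """Pick one compression ratio per layer.

    eval_scores:  {layer: {candidate_ratio: model score with only that layer
                   compressed to candidate_ratio}}
    layer_costs:  {layer: original cost (for example MACs or memory)}
    target_ratio: required (compressed cost / original cost) for the model

    Tries score thresholds from highest to lowest. For each threshold, every
    layer takes its smallest candidate ratio whose score still meets the
    threshold (or stays uncompressed at 1.0). The first threshold whose
    whole-model ratio meets the target is returned.
    """
    total_cost = sum(layer_costs.values())
    thresholds = sorted(
        {score for scores in eval_scores.values() for score in scores.values()},
        reverse=True,
    )
    for threshold in thresholds:
        chosen = {}
        for layer, scores in eval_scores.items():
            passing = [ratio for ratio, score in scores.items() if score >= threshold]
            chosen[layer] = min(passing) if passing else 1.0
        achieved = sum(layer_costs[name] * ratio for name, ratio in chosen.items()) / total_cost
        if achieved <= target_ratio:
            return chosen, threshold, achieved
    raise ValueError("no candidate combination meets the target compression ratio")


if __name__ == "__main__":
    # Hypothetical sensitivity data: conv1 loses accuracy quickly, conv2 does not.
    scores = {
        "conv1": {0.25: 0.60, 0.5: 0.70, 0.75: 0.74},
        "conv2": {0.25: 0.72, 0.5: 0.74, 0.75: 0.75},
    }
    costs = {"conv1": 100, "conv2": 300}
    print(select_compression_ratios(scores, costs, target_ratio=0.5))
```

In this example the sensitive layer (conv1) is compressed lightly and the robust layer (conv2) heavily, which is the behavior the paragraph describes: layers are not equally compressible, so each gets its own ratio.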

quic-hitameht added the documentation label (Improvements or additions to documentation) on Oct 14, 2024
quic-ipendse and others added 18 commits October 15, 2024 09:12
…tats

* Adding boxplots for advanced stats and modifying the backend for the same
* Cleanup JavaScript callbacks by packaging repetitive code as a function in utils
* Generalize the code in callbacks by adaptively iterating over columns present in datasources instead of manually listing them
* Dynamically adjust boxplot width according to the number of boxplots
* Update docstrings

---------

Signed-off-by: Ishan Pendse <quic_ipendse@quicinc.com>
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
* Implement pyproject.toml to build AIMET package

There are 3 dynamic fields in metadata:
 - version
 - dependencies
 - name (not PEP compatible!)
A plugin of scikit-build-core build system generates dependencies
and package name based on `CMAKE_ARGS` environment variable.

Signed-off-by: Evgeny Mironov <quic_emironov@quicinc.com>

* Build all docker images from the single Dockerfile

Docker images contain dependencies to build and run tests.

Signed-off-by: Evgeny Mironov <quic_emironov@quicinc.com>

* Build docker images on a CI

Signed-off-by: Evgeny Mironov <quic_emironov@quicinc.com>

---------

Signed-off-by: Evgeny Mironov <quic_emironov@quicinc.com>
Co-authored-by: Evgeny Mironov <quic_emironov@quicinc.com>
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
Signed-off-by: Raj Gite <quic_rgite@quicinc.com>
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
Signed-off-by: Raj Gite <quic_rgite@quicinc.com>
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
Signed-off-by: Kyunggeun Lee <quic_kyunggeu@quicinc.com>
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
Signed-off-by: Kevin Hsieh <quic_klhsieh@quicinc.com>
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
* Edited Quantization User Guide.

Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>

* Quantization Guide - more edits.

Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>

* Corrected TOC errors introduced by Quant UG edits.

Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>

* Review changes PR quic#3348 - Quantization UG edits.

Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>

* More review changes PR quic#3348 - Quantization UG edits.

Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>

---------

Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
Signed-off-by: Kyunggeun Lee <quic_kyunggeu@quicinc.com>
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
Signed-off-by: Kyunggeun Lee <quic_kyunggeu@quicinc.com>
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
Signed-off-by: Kyunggeun Lee <quic_kyunggeu@quicinc.com>
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
Signed-off-by: Kevin Hsieh <quic_klhsieh@quicinc.com>
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
Signed-off-by: Sai Chaitanya Gajula <quic_gsaichai@quicinc.com>
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
- Backend doesn't expect non-float tensors to have encodings. Compare op's
  output is a tensor of bools for which the encoding shouldn't have been
  present. This change fixes this issue by disabling the quantizer.

Signed-off-by: yathindra kota <quic_ykota@quicinc.com>
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
Signed-off-by: Michael Tuttle <quic_mtuttle@quicinc.com>
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
…uic#3370)

Signed-off-by: Kyunggeun Lee <quic_kyunggeu@quicinc.com>
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
quic#3371)

Signed-off-by: Kyunggeun Lee <quic_kyunggeu@quicinc.com>
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
* Add batch norm fold support in QuantSim v2

Signed-off-by: Priyanka Dangi <quic_pdangi@quicinc.com>
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
Signed-off-by: yathindra kota <quic_ykota@quicinc.com>
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
quic-kyunggeu and others added 18 commits October 15, 2024 09:12
Signed-off-by: Kyunggeun Lee <quic_kyunggeu@quicinc.com>
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
Signed-off-by: Kyunggeun Lee <quic_kyunggeu@quicinc.com>
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
Signed-off-by: Michael Tuttle <quic_mtuttle@quicinc.com>
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
* Remove marked_module from input and output field of onnx.GraphProto

Signed-off-by: Hitarth Mehta <quic_hitameht@quicinc.com>

* Add a unit test to verify the fix

Signed-off-by: Hitarth Mehta <quic_hitameht@quicinc.com>

---------

Signed-off-by: Hitarth Mehta <quic_hitameht@quicinc.com>
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
Signed-off-by: Michael Tuttle <quic_mtuttle@quicinc.com>
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
Signed-off-by: Chetan Gulecha <quic_cgulecha@quicinc.com>
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
Signed-off-by: Michael Tuttle <quic_mtuttle@quicinc.com>
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
Signed-off-by: Michael Tuttle <quic_mtuttle@quicinc.com>
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
* Add utility to tie quantizers in aimet_onnx

Signed-off-by: Hitarth Mehta <quic_hitameht@quicinc.com>

* Incorporate review feedback

Signed-off-by: Hitarth Mehta <quic_hitameht@quicinc.com>

* Expose temp API to apply constraints

Signed-off-by: Hitarth Mehta <quic_hitameht@quicinc.com>

---------

Signed-off-by: Hitarth Mehta <quic_hitameht@quicinc.com>
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
Signed-off-by: Michael Tuttle <quic_mtuttle@quicinc.com>
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
Signed-off-by: Kevin Hsieh <quic_klhsieh@quicinc.com>
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
* Delete symlinks, Unpin torch version, upgrade cuda version to 11.8

Signed-off-by: Hitarth Mehta <quic_hitameht@quicinc.com>

* Add complete path to html with --find-links

Signed-off-by: Hitarth Mehta <quic_hitameht@quicinc.com>

* Fixed unpinned torch install in onnx dockerfiles, updated ALL dockerfiles for consistency

Signed-off-by: Bharath Ramaswamy <quic_bharathr@quicinc.com>

* Pinned back numpy in onnx dev dockers, updated numpy version in reqs file

Signed-off-by: Bharath Ramaswamy <quic_bharathr@quicinc.com>

* Cleaned up onnx dependency lists

Signed-off-by: Bharath Ramaswamy <quic_bharathr@quicinc.com>

* Added missing files

Signed-off-by: Bharath Ramaswamy <quic_bharathr@quicinc.com>

* Updated reqs_deb_onnx_gpu file with cuda 11.8 versions of packages

Signed-off-by: Bharath Ramaswamy <quic_bharathr@quicinc.com>

---------

Signed-off-by: Hitarth Mehta <quic_hitameht@quicinc.com>
Signed-off-by: Bharath Ramaswamy <quic_bharathr@quicinc.com>
Co-authored-by: Hitarth Mehta <quic_hitameht@quicinc.com>
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
Signed-off-by: Kevin Hsieh <quic_klhsieh@quicinc.com>
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
Signed-off-by: yathindra kota <quic_ykota@quicinc.com>
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
Signed-off-by: Kevin Hsieh <quic_klhsieh@quicinc.com>
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
…inks option) (quic#3397)

* Added missing find-links URL option to torch installation instructions
* Fixed onnx wheel instructions, added common pre-requisites
* Fixed onnx-cpu wheel install instructions
* Added pip version pinning

Signed-off-by: Bharath Ramaswamy <quic_bharathr@quicinc.com>
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
Signed-off-by: Dave Welsch <116022979+dwelsch-esi@users.noreply.github.com>
Labels
documentation Improvements or additions to documentation