PyTorch Transformer model Roberta-base for Natural Language Classifier

This document describes how to evaluate optimized checkpoints of the Roberta-base transformer model on natural language (NL) classification tasks.

AIMET installation and setup

Please install and set up AIMET (Torch GPU variant) before proceeding.

NOTE

  • All AIMET releases are available here: https://github.com/quic/aimet/releases
  • This model has been tested using AIMET version 1.23.0 (i.e. set release_tag="1.23.0" in the above instructions).
  • This model is compatible with the PyTorch GPU variant of AIMET (i.e. set AIMET_VARIANT="torch_gpu" in the above instructions).

Additional Setup Dependencies

pip install datasets==2.4.0
pip install transformers==4.11.3 
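
Because the evaluation was tested against these exact pins, it can help to confirm the active environment matches them before running anything. The following is a minimal sketch, not part of the model zoo: the helper name `check_pins` is hypothetical, and it relies only on the standard-library `importlib.metadata` module.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins taken from the pip commands above.
REQUIRED = {"datasets": "2.4.0", "transformers": "4.11.3"}


def check_pins(required, get_version=version):
    """Return {package: (expected, found)} for each mismatch.

    `found` is None when the package is not installed.
    `check_pins` is a hypothetical helper, not an AIMET function.
    """
    mismatches = {}
    for name, expected in required.items():
        try:
            found = get_version(name)
        except PackageNotFoundError:
            found = None
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in check_pins(REQUIRED).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

An empty result means both packages are installed at the pinned versions.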

Model checkpoint

  • Original full-precision checkpoints without downstream training were downloaded from Hugging Face.
  • Full-precision model weight files with downstream training are downloaded automatically by the evaluation script.
  • Quantization-optimized model weight files are downloaded automatically by the evaluation script.

Dataset

Usage

To run evaluation with QuantSim for Natural Language Classifier tasks in AIMET, use the following command:

python transformers_nlclassifier_quanteval.py \
        --model_name_or_path <MODEL_NAME> \
        --task_name <TASK> \
        --per_device_eval_batch_size 4 \
        --output_dir <OUT_DIR>

  • Example:

    python transformers_nlclassifier_quanteval.py --model_name_or_path roberta-base --task_name rte --per_device_eval_batch_size 4 --output_dir ./evaluation_result

  • Supported values for task_name are "rte", "stsb", "mrpc", "cola", "sst2", "qnli", "qqp", "mnli"

  • Supported values for model_name_or_path are "bert-base-uncased", "google/mobilebert-uncased", "microsoft/Roberta-base", "distilbert-base-uncased", "roberta-base"
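
To evaluate every supported task in one pass, the command above can be generated per task. This is a minimal sketch under stated assumptions: the helper `build_command` and the per-task output layout `<out_root>/<task>` are hypothetical, while the script name, flags, and task names are taken from this document. The commands are printed rather than executed.

```python
import shlex

# Task names as listed in this document.
TASKS = ["rte", "stsb", "mrpc", "cola", "sst2", "qnli", "qqp", "mnli"]


def build_command(model_name, task, out_root="./evaluation_result", batch_size=4):
    """Build the argument list for one evaluation run.

    Writing each task to out_root/<task> is an assumed layout,
    chosen here only to keep the results of different tasks apart.
    """
    return [
        "python", "transformers_nlclassifier_quanteval.py",
        "--model_name_or_path", model_name,
        "--task_name", task,
        "--per_device_eval_batch_size", str(batch_size),
        "--output_dir", f"{out_root}/{task}",
    ]


if __name__ == "__main__":
    for task in TASKS:
        # Print only; pass the list to subprocess.run(cmd, check=True) to execute it.
        print(shlex.join(build_command("roberta-base", task)))
```

Building the command as a list rather than a single string avoids shell-quoting problems if a path contains spaces.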

Quantization Configuration

The following configuration has been used for the above models for INT8 quantization:

Results

Below are the results of the PyTorch transformer model Roberta-base on the GLUE dataset:

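| Configuration | CoLA (corr) | SST-2 (acc) | MRPC (f1) | STS-B (corr) | QQP (acc) | MNLI (acc) | QNLI (acc) | RTE (acc) | GLUE |
|---------------|-------------|-------------|-----------|--------------|-----------|------------|------------|-----------|-------|
| FP32          | 60.36       | 94.72       | 91.84     | 90.54        | 91.24     | 87.29      | 92.33      | 72.56     | 85.11 |
| W8A8          | 57.35       | 92.55       | 92.69     | 90.15        | 90.09     | 86.88      | 91.47      | 72.92     | 84.26 |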
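
The GLUE column appears to be the unweighted mean of the eight per-task scores. That is an inference from the reported numbers, not something this document states. The sketch below recomputes the mean from the reported values so the aggregate can be checked; `glue_average` is a hypothetical helper name.

```python
# Per-task scores copied from the results above, in column order:
# CoLA, SST-2, MRPC, STS-B, QQP, MNLI, QNLI, RTE.
SCORES = {
    "FP32": [60.36, 94.72, 91.84, 90.54, 91.24, 87.29, 92.33, 72.56],
    "W8A8": [57.35, 92.55, 92.69, 90.15, 90.09, 86.88, 91.47, 72.92],
}


def glue_average(scores):
    """Unweighted mean of the task scores, rounded to two decimals.

    Assumes the GLUE column is a plain average (inferred, not documented).
    """
    return round(sum(scores) / len(scores), 2)


if __name__ == "__main__":
    for config, scores in SCORES.items():
        print(config, glue_average(scores))
```

Under that assumption, the recomputed means agree with the reported GLUE values for both configurations.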