
PyTorch Transformer model Bert-base-uncased for Natural Language Classification and Question Answering

This document describes evaluation of optimized checkpoints for the transformer model Bert-base-uncased on Natural Language Classification and Question Answering tasks.

AIMET installation and setup

Please install and set up AIMET (Torch GPU variant) before proceeding further.

NOTE

  • All AIMET releases are available here: https://github.com/quic/aimet/releases
  • This model has been tested using AIMET version 1.23.0 (i.e. set release_tag="1.23.0" in the above instructions).
  • This model is compatible with the PyTorch GPU variant of AIMET (i.e. set AIMET_VARIANT="torch_gpu" in the above instructions).

Additional Setup Dependencies

pip install datasets==2.4.0
pip install transformers==4.11.3 
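
Since the evaluation scripts are pinned to these exact versions, it can be useful to check what is installed before running them. A minimal standard-library sketch (the assumption that exact pins are required, rather than minimum versions, is mine):

```python
from importlib.metadata import version, PackageNotFoundError

# Versions pinned by this README (assumption: exact pins are required).
PINNED = {"datasets": "2.4.0", "transformers": "4.11.3"}

def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None

for pkg, wanted in PINNED.items():
    found = installed_version(pkg)
    status = "OK" if found == wanted else f"expected {wanted}, found {found}"
    print(f"{pkg}: {status}")
```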

Model checkpoint

  • Original full-precision checkpoints without downstream training were downloaded through Hugging Face
  • Full-precision model weight files with downstream training are downloaded automatically by the evaluation script
  • Quantization-optimized model weight files are downloaded automatically by the evaluation script

Dataset

The Natural Language Classification tasks are evaluated on the GLUE benchmark and the Question Answering task on SQuAD. Both datasets are downloaded automatically by the evaluation scripts.

Usage

To run evaluation with QuantSim for Natural Language Classification tasks in AIMET, use the following command:

python transformers_nlclassifier_quanteval.py \
        --model_name_or_path <MODEL_NAME> \
        --task_name <TASK> \
        --per_device_eval_batch_size 4 \
        --output_dir <OUT_DIR>

  • Example:

    python transformers_nlclassifier_quanteval.py --model_name_or_path bert-base-uncased   --task_name rte  --per_device_eval_batch_size 4 --output_dir ./evaluation_result 
    
  • Supported task_name values are "rte", "stsb", "mrpc", "cola", "sst2", "qnli", "qqp", "mnli"

  • Supported model_name_or_path values are "bert-base-uncased", "google/mobilebert-uncased", "microsoft/MiniLM-L12-H384-uncased", "distilbert-base-uncased", "roberta-base"
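
To sweep all eight supported GLUE tasks, the command above can be assembled programmatically. A minimal standard-library sketch (the script name and flags are taken from this README; actually running the commands requires the AIMET setup described above):

```python
# All task_name values supported by the NL classification evaluation script.
SUPPORTED_TASKS = ["rte", "stsb", "mrpc", "cola", "sst2", "qnli", "qqp", "mnli"]

def build_eval_command(model, task, batch_size=4, output_dir="./evaluation_result"):
    """Assemble the argument list for transformers_nlclassifier_quanteval.py."""
    return [
        "python", "transformers_nlclassifier_quanteval.py",
        "--model_name_or_path", model,
        "--task_name", task,
        "--per_device_eval_batch_size", str(batch_size),
        "--output_dir", output_dir,
    ]

commands = [build_eval_command("bert-base-uncased", t) for t in SUPPORTED_TASKS]
for cmd in commands:
    print(" ".join(cmd))
```

Each list can be passed to subprocess.run to execute the evaluation for that task.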

To run evaluation with QuantSim for Question Answering tasks in AIMET, use the following command:

python transformers_qa_quanteval.py \
    --model_name_or_path <MODEL_NAME> \
    --dataset_name <DATASET_NAME> \
    --per_device_eval_batch_size 4 \
    --output_dir <OUT_DIR>
  • Example:

    python transformers_qa_quanteval.py --model_name_or_path bert-base-uncased --dataset_name squad  --per_device_eval_batch_size 4 --output_dir ./evaluation_result 
    
  • Supported model_name_or_path values are "bert-base-uncased", "google/mobilebert-uncased", "microsoft/MiniLM-L12-H384-uncased", "distilbert-base-uncased", "roberta-base"

  • Supported dataset_name value is "squad"
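
The QA script's command-line interface can be mirrored with argparse. A minimal sketch of the four flags shown above (the real script accepts additional options not reproduced here):

```python
import argparse

def make_parser():
    """Parser covering only the flags used in the QA example above (a sketch)."""
    parser = argparse.ArgumentParser(
        description="QuantSim QA evaluation (interface sketch)")
    parser.add_argument("--model_name_or_path", required=True,
                        help='e.g. "bert-base-uncased"')
    parser.add_argument("--dataset_name", default="squad",
                        help='only "squad" is supported')
    parser.add_argument("--per_device_eval_batch_size", type=int, default=4)
    parser.add_argument("--output_dir", required=True)
    return parser

# Parse the same arguments as the example invocation above.
args = make_parser().parse_args([
    "--model_name_or_path", "bert-base-uncased",
    "--dataset_name", "squad",
    "--per_device_eval_batch_size", "4",
    "--output_dir", "./evaluation_result",
])
print(args.model_name_or_path, args.dataset_name)
```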

Quantization Configuration

INT8 quantization (8-bit weights and 8-bit activations, denoted W8A8 in the results below) has been used for the above models.

Results

Below are the results of the PyTorch transformer model Bert-base-uncased on the GLUE dataset:

Configuration  CoLA (corr)  SST-2 (acc)  MRPC (f1)  STS-B (corr)  QQP (acc)  MNLI (acc)  QNLI (acc)  RTE (acc)  GLUE
FP32           58.76        93.12        89.93      88.84         90.94      85.19       91.63       66.43      83.11
W8A8           56.93        91.28        90.34      89.13         90.78      81.68       91.14       68.23      82.44
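
The GLUE column appears to be the unweighted mean of the eight per-task scores (an assumption, but one consistent with the reported numbers), which can be checked directly:

```python
# Per-task scores from the results table, in column order
# (CoLA, SST-2, MRPC, STS-B, QQP, MNLI, QNLI, RTE).
fp32 = [58.76, 93.12, 89.93, 88.84, 90.94, 85.19, 91.63, 66.43]
w8a8 = [56.93, 91.28, 90.34, 89.13, 90.78, 81.68, 91.14, 68.23]

def glue_average(scores):
    """Unweighted mean over the eight GLUE task scores."""
    return sum(scores) / len(scores)

print(f"FP32 GLUE: {glue_average(fp32):.2f}")
print(f"W8A8 GLUE: {glue_average(w8a8):.2f}")
```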