A reproduction of "Think Big, Generate Quick: LLM-to-SLM for Fast Autoregressive Decoding" (Bergner et al., 2024).
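In LLM-to-SLM, a large language model encodes the prompt once in parallel, and a small language model then performs all autoregressive decoding steps, conditioned on the LLM's prompt representation through a learned projector, so the per-token cost is roughly that of the SLM alone. The sketch below illustrates the idea only; `llm_encoder`, `projector`, `slm`, and `slm.embed` are hypothetical stand-ins, not this repository's actual API.

```python
# Minimal conceptual sketch of LLM-to-SLM decoding. All modules here are
# hypothetical placeholders, not this repository's actual classes.
import torch

@torch.no_grad()
def generate(prompt_ids, llm_encoder, projector, slm, max_new_tokens=32):
    # One parallel pass of the large model over the prompt.
    llm_states = llm_encoder(prompt_ids)            # (1, T, d_llm)
    # Project LLM features into the SLM's embedding space.
    cond = projector(llm_states)                    # (1, T, d_slm)
    out = prompt_ids
    for _ in range(max_new_tokens):
        # Every autoregressive step runs on the cheap SLM only.
        emb = slm.embed(out)                        # (1, T + t, d_slm)
        emb[:, : cond.size(1)] += cond              # condition prompt positions on the LLM
        logits = slm(inputs_embeds=emb)             # (1, T + t, vocab)
        next_id = logits[:, -1].argmax(-1, keepdim=True)
        out = torch.cat([out, next_id], dim=-1)     # greedy decoding for simplicity
    return out
```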
git clone https://github.com/tosiyuki/LLM-to-SLM.git
wget -P dataset/ https://raw.githubusercontent.com/tatsu-lab/stanford_alpaca/main/alpaca_data.json
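The downloaded file is a single JSON array of roughly 52K instruction-following examples, each with "instruction", "input", and "output" fields. A quick sanity check:

```python
# Verify the Alpaca data downloaded correctly (a minimal sketch).
import json

with open("dataset/alpaca_data.json") as f:
    data = json.load(f)

print(len(data))        # ~52,000 examples
print(sorted(data[0]))  # ['input', 'instruction', 'output']
```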
bash scripts/t5-3b-to-gpt2.sh
python demo.py
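Per the paper, training keeps the LLM frozen and learns a projector that maps LLM hidden states into the SLM's embedding space (the SLM itself is typically fine-tuned as well). One plausible projector shape is sketched below; the class name, the MLP design, and the default dimensions (1024 for T5-3B, 768 for GPT-2) are assumptions for illustration, not this repository's exact code.

```python
# Hypothetical projector: maps frozen-LLM hidden states to the SLM's
# embedding size. Shape and defaults are illustrative assumptions.
import torch.nn as nn

class Projector(nn.Module):
    def __init__(self, d_llm: int = 1024, d_slm: int = 768):
        super().__init__()
        # A small MLP; a plain linear layer is another common choice.
        self.net = nn.Sequential(
            nn.Linear(d_llm, d_slm),
            nn.GELU(),
            nn.Linear(d_slm, d_slm),
        )

    def forward(self, h):   # h: (B, T, d_llm)
        return self.net(h)  # (B, T, d_slm)
```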
This model uses T5-3B as the LLM and GPT-2 as the SLM. The training data is the 52K-example Stanford Alpaca dataset, which is released for non-commercial use only, so the resulting model cannot be used for commercial purposes.
Bergner et al., "Think Big, Generate Quick: LLM-to-SLM for Fast Autoregressive Decoding," 2024.