
DistilBERT for QA fine-tuned on SQuAD v1.1 with Knowledge Distillation

Released by @andrelmfarias on 25 Oct, 09:21

This release ships a version of the DistilBERT model trained on SQuAD 1.1 using Knowledge Distillation, with bert-large-uncased-whole-word-masking-finetuned-squad as the teacher.
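For reference, knowledge distillation here means the student (DistilBERT) is trained to match the teacher's softened start/end logits in addition to the gold SQuAD spans. The sketch below is illustrative only, not the exact training code used for this release; the `temperature` and `alpha` hyperparameters are assumptions.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, gold_positions,
                      temperature=2.0, alpha=0.5):
    """Mix a soft-target loss (teacher) with a hard-target loss (gold labels).

    student_logits / teacher_logits: (batch, seq_len) start or end logits.
    gold_positions: (batch,) gold start or end token indices.
    """
    # Soft targets: KL divergence between temperature-softened distributions.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)

    # Hard targets: the usual span-prediction cross-entropy on SQuAD labels.
    hard_loss = F.cross_entropy(student_logits, gold_positions)

    return alpha * soft_loss + (1.0 - alpha) * hard_loss
```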

This version of DistilBERT achieves 80.1% EM and 87.5% F1-score (vs. 81.2% EM and 88.6% F1-score for our version of BERT), while being much faster and lighter.

This version is available only with the sklearn wrapper.
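A minimal usage sketch, assuming the model is downloaded from the release assets as `distilbert_qa.joblib` (hypothetical filename, check the assets for the actual name) and plugged into the cdQA `QAPipeline`; the DataFrame contents are placeholders:

```python
import pandas as pd
from cdqa.pipeline import QAPipeline

# Corpus in the cdQA format: one row per document, with a "title" column
# and a "paragraphs" column holding a list of strings.
df = pd.DataFrame({
    "title": ["Sample document"],
    "paragraphs": [["DistilBERT is a distilled version of BERT."]],
})

# Load the sklearn-wrapped DistilBERT reader shipped with this release.
cdqa_pipeline = QAPipeline(reader="distilbert_qa.joblib")

# Fit the retriever on the corpus, then ask a question.
cdqa_pipeline.fit_retriever(df=df)
prediction = cdqa_pipeline.predict(query="What is DistilBERT?")
print(prediction)
```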