DistilBERT for QA fine-tuned on SQuAD v1.1 with Knowledge Distillation
Released by andrelmfarias on 25 Oct
This release provides a version of the DistilBERT model trained on SQuAD 1.1 using knowledge distillation, with bert-large-uncased-whole-word-masking-finetuned-squad as the teacher.
This version of DistilBERT achieves 80.1% EM and 87.5% F1-score (vs. 81.2% EM and 88.6% F1-score for our version of BERT), while being much faster and lighter.
This version is available only through the sklearn wrapper.
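
As a quick sanity check, the sklearn-wrapped reader can be loaded with joblib. This is a minimal sketch only: the file name `distilbert_qa.joblib` below is an assumption about the release asset, not something stated in this release.

```python
# Minimal loading sketch for the sklearn-wrapped DistilBERT reader.
# NOTE: "distilbert_qa.joblib" is an assumed file name; replace it with the
# actual asset attached to this release.
import joblib

reader = joblib.load("distilbert_qa.joblib")

# Since the model is exposed through an sklearn-style wrapper, it can be
# inspected like any sklearn estimator.
print(type(reader))
print(reader.get_params())
```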