DistilBERT for QA fine-tuned on SQuAD v1.1 with Knowledge Distillation
Released by andrelmfarias on 25 Oct
This release provides a version of the DistilBERT model trained on SQuAD 1.1 using knowledge distillation, with bert-large-uncased-whole-word-masking-finetuned-squad as the teacher.
This version of DistilBERT achieves 80.1% EM and 87.5% F1-score (vs. 81.2% EM and 88.6% F1-score for our version of BERT), while being much faster and lighter.
This version is available only through the sklearn wrapper.
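
As a quick sanity check, the sklearn-wrapped reader can be loaded with joblib. This is a minimal sketch only: the file name `distilbert_qa.joblib` below is an assumption about the release asset, not something stated in this release.

```python
# Minimal loading sketch for the sklearn-wrapped DistilBERT reader.
# NOTE: "distilbert_qa.joblib" is an assumed file name; replace it with the
# actual asset attached to this release.
import joblib

reader = joblib.load("distilbert_qa.joblib")

# Since the model is exposed through an sklearn-style wrapper, it can be
# inspected like any sklearn estimator.
print(type(reader))
print(reader.get_params())
```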