
# Data

Datasets:

- `training.csv`: the MED and HELP datasets, converted to fit our schema (see `sarn/convert/med.py` and `sarn/convert/help.py`) and then concatenated
- `evaluation.csv`: the SuperGLUE diagnostics dataset filtered by the Logic categories "Quantification" and "Monotonicity", and the FraCaS problem set filtered by the category "1 GENERALIZED QUANTIFIERS", converted to fit our schema (see `sarn/convert/superglue.py` and `sarn/convert/fracas.py`) and then concatenated
- `training-adj.csv`: based on the MED dataset. Adjectives that occur in both premise and hypothesis are replaced with their WordNet opposites, once on both sides; only one type of adjective is replaced at a time, and only on one side at a time, thus generating multiple output variants of the same input pair (see `sarn/convert/med_adjectives.py` and `sarn/adjectives.py`). As we were not able to generate labels automatically with MonaLog or ccg2lambda (both returned "unknown" for every sequence pair we provided), we had to label the entire dataset by hand and therefore reduced it to 1200 rows. We also took this opportunity to correct some entries by hand where it made sense, e.g. to obtain an entailment from an opposite adjective instead of a contradiction, and added the six examples from FraCaS 5.3 "Opposites" (see `sarn/convert/fracas_adjectives.py`).
- `evaluation-adj.csv`: based on our evaluation dataset `evaluation.csv`. Adjectives are replaced with their WordNet opposites and the results labelled by hand, following the same procedure as for `training-adj.csv` (see `sarn/convert/evaluation_adjectives.py` and `sarn/adjectives.py`); we did not have to reduce the dataset size, as it only contained 144 rows.
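The one-replacement-at-a-time scheme used for the `*-adj.csv` datasets can be sketched as follows. This is a minimal illustration, not the actual implementation: the hard-coded antonym map and the `variants` function are hypothetical stand-ins for the WordNet antonym lookup assumed to live in `sarn/adjectives.py`.

```python
# Tiny stand-in for a WordNet antonym lookup (hypothetical; the real
# code is assumed to query WordNet for each adjective's opposites).
ANTONYMS = {"large": "small", "happy": "sad"}

def variants(premise: str, hypothesis: str):
    """Yield (premise, hypothesis) pairs in which exactly one adjective
    has been replaced by its antonym, on exactly one side at a time."""
    for adj, opp in ANTONYMS.items():
        for side in ("premise", "hypothesis"):
            text = premise if side == "premise" else hypothesis
            words = text.split()
            if adj not in words:
                continue
            replaced = " ".join(opp if w == adj else w for w in words)
            if side == "premise":
                yield replaced, hypothesis
            else:
                yield premise, replaced

# One input pair yields one output variant per (adjective, side) match:
pairs = list(variants("A large dog barks", "A large animal barks"))
# pairs[0] replaces "large" only in the premise,
# pairs[1] replaces it only in the hypothesis.
```

Each output variant then still needs a new entailment label, which is why the datasets above were labelled by hand.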