This is an extractive text-comprehension task formulated as question answering: each question about a text must be answered with a fragment extracted verbatim from the text itself. The texts are academic news articles from Cambridge University covering several scientific domains. Questions that cannot be answered from the text are not included.
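To make the formulation concrete, the sketch below shows how an encoder-based model extracts an answer span from a passage. It is a minimal illustration assuming the Hugging Face `transformers` question-answering pipeline and a publicly available RoBERTa checkpoint fine-tuned on SQuAD 2.0 (`deepset/roberta-base-squad2`); this is not the setup of the systems evaluated here, and the example passage and question are invented.

```python
# Minimal extractive-QA sketch (assumed setup: Hugging Face transformers
# pipeline with a public SQuAD2-tuned RoBERTa checkpoint, not one of the
# systems listed in the results table below).
from transformers import pipeline

qa = pipeline("question-answering", model="deepset/roberta-base-squad2")

# Invented passage and question, for illustration only.
context = (
    "Researchers at Cambridge University have developed a new battery "
    "material that can be recharged in a matter of minutes."
)
question = "What have the researchers developed?"

result = qa(question=question, context=context)
# The prediction is a span copied verbatim from the context, returned with
# its character offsets and a confidence score.
print(result["answer"], result["start"], result["end"], result["score"])
```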
Language: English
NLP topic:
Abstract task:
Dataset:
Year: 2024
Ranking metric: F1
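Because F1 is the ranking metric, the sketch below shows how span-level F1 between a predicted and a gold answer is typically computed for extractive QA (SQuAD-style token overlap); whether this leaderboard applies exactly the same tokenization and normalization is an assumption.

```python
# Token-overlap F1 between a predicted span and a gold span, SQuAD-style.
# The exact normalization used by this leaderboard is an assumption.
from collections import Counter

def token_f1(prediction: str, gold: str) -> float:
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; otherwise the score is 0.
        return float(pred_tokens == gold_tokens)
    common = Counter(pred_tokens) & Counter(gold_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

# Example: a near-miss prediction scores high but not 1.0.
print(round(token_f1("the University of Cambridge", "University of Cambridge"), 3))  # 0.857
```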
Task results
| System | Precision | Recall | F1 | CEM | Accuracy |
|---|---|---|---|---|---|
| RoBERTa large | 0.4626 | 0.4626 | 0.4626 | 0.4626 | 0.46 |
| XLM-RoBERTa large | 0.4163 | 0.4163 | 0.4163 | 0.4163 | 0.42 |
| RoBERTa base | 0.3746 | 0.3746 | 0.3746 | 0.3746 | 0.37 |
| XLM-RoBERTa base | 0.3251 | 0.3251 | 0.3251 | 0.3251 | 0.33 |
| IXAmBERT base cased (ixa-ehu) | 0.3222 | 0.3222 | 0.3222 | 0.3222 | 0.32 |
| BERT base cased | 0.2996 | 0.2996 | 0.2996 | 0.2996 | 0.30 |
| BERT base multilingual cased | 0.2948 | 0.2948 | 0.2948 | 0.2948 | 0.29 |
| DistilBERT base uncased | 0.2670 | 0.2670 | 0.2670 | 0.2670 | 0.27 |
| DistilBERT base multilingual cased | 0.1994 | 0.1994 | 0.1994 | 0.1994 | 0.20 |