ODESIA v1 Leaderboard - Results
ODESIA Core Tasks
Task | Spanish baseline | Best result in Spanish | English baseline | Best result in English | Gap |
---|---|---|---|---|---|
EXIST 2022: Sexism detection (ES) | 0.69 | 0.77 | 0.67 | 0.81 | 17% |
EXIST 2022: Sexism categorisation (ES) | 0.46 | 0.57 | 0.44 | 0.58 | 10% |
DIPROMATS 2023: Propaganda identification (ES) | 0.75 | 0.82 | 0.71 | 0.82 | 11% |
DIPROMATS 2023: Coarse propaganda characterization (ES) | 0.22 | 0.47 | 0.21 | 0.55 | 48% |
DIPROMATS 2023: Fine-grained propaganda characterization (ES) | 0.09 | 0.26 | 0.08 | 0.47 | 299% |
DIANN 2023: Disability detection (ES) | 0.75 | 0.84 | 0.67 | 0.79 | 1% |
# | System | Arithmetic mean | EXIST 2022: Sexism detection (ES) | EXIST 2022: Sexism categorisation (ES) | DIPROMATS 2023: Propaganda identification (ES) | DIPROMATS 2023: Coarse propaganda characterization (ES) | DIPROMATS 2023: Fine-grained propaganda characterization (ES) | DIANN 2023: Disability detection (ES) |
---|---|---|---|---|---|---|---|---|
1 | bertin-roberta-base-spanish | 0.528 | 0.73 | 0.49 | 0.76 | 0.36 | 0.08 | 0.75 |
2 | distillbert-base-spanish-uncased | 0.527 | 0.72 | 0.51 | 0.77 | 0.34 | 0.07 | 0.75 |
3 | PlanTL-GOB-ES-roberta-base-bne | 0.567 | 0.74 | 0.56 | 0.81 | 0.42 | 0.12 | 0.75 |
4 | distilbert-base-multilingual-cased | 0.525 | 0.72 | 0.47 | 0.75 | 0.34 | 0.09 | 0.78 |
5 | bert-base-spanish-wwm-cased | 0.573 | 0.72 | 0.54 | 0.79 | 0.44 | 0.14 | 0.81 |
6 | PlanTL-GOB-ES-roberta-large-bne | 0.607 | 0.75 | 0.57 | 0.82 | 0.44 | 0.24 | 0.82 |
7 | ixambert-base-cased | 0.530 | 0.71 | 0.49 | 0.77 | 0.32 | 0.06 | 0.83 |
8 | bert-base-multilingual-cased | 0.543 | 0.72 | 0.47 | 0.78 | 0.35 | 0.10 | 0.84 |
9 | xlm-roberta-base | 0.573 | 0.74 | 0.50 | 0.79 | 0.47 | 0.10 | 0.84 |
10 | xlm-roberta-large | 0.620 | 0.77 | 0.56 | 0.82 | 0.47 | 0.26 | 0.84 |
# | System | Arithmetic mean | EXIST 2022: Sexism detection (EN) | EXIST 2022: Sexism categorisation (EN) | DIANN 2023: Disability detection (EN) | DIPROMATS 2023: Propaganda identification (EN) | DIPROMATS 2023: Coarse propaganda characterization (EN) | DIPROMATS 2023: Fine-grained propaganda characterization (EN) |
---|---|---|---|---|---|---|---|---|
1 | ixambert-base-cased | 0.570 | 0.75 | 0.53 | 0.73 | 0.78 | 0.49 | 0.14 |
2 | distilbert-base-uncased | 0.562 | 0.77 | 0.55 | 0.66 | 0.78 | 0.47 | 0.14 |
3 | distilbert-base-multilingual-cased | 0.555 | 0.74 | 0.53 | 0.68 | 0.77 | 0.45 | 0.16 |
4 | bert-base-cased | 0.588 | 0.76 | 0.53 | 0.72 | 0.81 | 0.50 | 0.21 |
5 | bert-base-multilingual-cased | 0.575 | 0.76 | 0.50 | 0.73 | 0.80 | 0.48 | 0.18 |
6 | roberta-base | 0.597 | 0.78 | 0.53 | 0.75 | 0.81 | 0.52 | 0.19 |
7 | xlm-roberta-base | 0.592 | 0.76 | 0.53 | 0.76 | 0.80 | 0.54 | 0.16 |
8 | roberta-large | 0.670 | 0.81 | 0.58 | 0.79 | 0.82 | 0.55 | 0.47 |
9 | xlm-roberta-large | 0.642 | 0.79 | 0.56 | 0.78 | 0.81 | 0.52 | 0.39 |
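The "Arithmetic mean" column is the unweighted average of a system's per-task scores, which is easy to verify against any row. A minimal sketch, using roberta-large's six English core-task scores transcribed from the table above:

```python
# Per-task scores for roberta-large on the English core tasks,
# transcribed from the leaderboard row above.
scores = {
    "EXIST 2022: Sexism detection (EN)": 0.81,
    "EXIST 2022: Sexism categorisation (EN)": 0.58,
    "DIANN 2023: Disability detection (EN)": 0.79,
    "DIPROMATS 2023: Propaganda identification (EN)": 0.82,
    "DIPROMATS 2023: Coarse propaganda characterization (EN)": 0.55,
    "DIPROMATS 2023: Fine-grained propaganda characterization (EN)": 0.47,
}

# The leaderboard's "Arithmetic mean" is the plain unweighted average.
mean = sum(scores.values()) / len(scores)
print(round(mean, 3))  # 0.67, matching the listed 0.670
```

The same check reproduces the other tables' means as well (e.g. xlm-roberta-large's Spanish core scores average to its listed 0.620).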
ODESIA Extended Tasks
Task | Spanish baseline | Best result in Spanish | English baseline | Best result in English | Gap |
---|---|---|---|---|---|
MLDOC 2018: Document classification (ES) | 0.93 | 0.96 | 0.88 | 0.98 | 40% |
Multilingual Complex Named Entity Recognition 2022 (ES) | 0.52 | 0.71 | 0.55 | 0.75 | 5% |
SQAC-SQUAD 2016: Question answering (ES) | 0.53 | 0.77 | 0.52 | 0.88 | 25% |
Semantic Textual Similarity 2017 (ES) | 0.68 | 0.81 | 0.70 | 0.86 | 13% |
# | System | Arithmetic mean | MLDOC 2018: Document classification (ES) | Multilingual Complex Named Entity Recognition 2022 (ES) | SQAC-SQUAD 2016: Question answering (ES) | Semantic Textual Similarity 2017 (ES) |
---|---|---|---|---|---|---|
1 | ixambert-base-cased | 0.778 | 0.96 | 0.63 | 0.71 | 0.81 |
2 | bertin-roberta-base-spanish | 0.745 | 0.96 | 0.62 | 0.73 | 0.67 |
3 | distilbert-base-multilingual-cased | 0.698 | 0.94 | 0.61 | 0.55 | 0.69 |
4 | bert-base-multilingual-cased | 0.753 | 0.96 | 0.64 | 0.71 | 0.70 |
5 | xlm-roberta-base | 0.753 | 0.95 | 0.66 | 0.67 | 0.73 |
6 | distillbert-base-spanish-uncased | 0.710 | 0.96 | 0.61 | 0.53 | 0.74 |
7 | PlanTL-GOB-ES-roberta-base-bne | 0.773 | 0.96 | 0.64 | 0.74 | 0.75 |
8 | PlanTL-GOB-ES-roberta-large-bne | 0.780 | 0.96 | 0.63 | 0.77 | 0.76 |
9 | bert-base-spanish-wwm-cased | 0.773 | 0.96 | 0.63 | 0.71 | 0.79 |
10 | xlm-roberta-large | 0.810 | 0.96 | 0.71 | 0.77 | 0.80 |
# | System | Arithmetic mean | MLDOC 2018: Document classification (EN) | Multilingual Complex Named Entity Recognition 2022 (EN) | SQAC-SQUAD 2016: Question answering (EN) | Semantic Textual Similarity 2017 (EN) |
---|---|---|---|---|---|---|
1 | bert-base-multilingual-cased | 0.813 | 0.97 | 0.67 | 0.81 | 0.80 |
2 | ixambert-base-cased | 0.813 | 0.98 | 0.65 | 0.80 | 0.82 |
3 | distilbert-base-multilingual-cased | 0.778 | 0.97 | 0.63 | 0.75 | 0.76 |
4 | xlm-roberta-base | 0.818 | 0.98 | 0.69 | 0.80 | 0.80 |
5 | distilbert-base-uncased | 0.805 | 0.97 | 0.67 | 0.77 | 0.81 |
6 | bert-base-cased | 0.813 | 0.97 | 0.68 | 0.78 | 0.82 |
7 | roberta-base | 0.845 | 0.98 | 0.70 | 0.85 | 0.85 |
8 | roberta-large | 0.868 | 0.98 | 0.75 | 0.88 | 0.86 |
9 | xlm-roberta-large | 0.855 | 0.98 | 0.74 | 0.86 | 0.84 |