Semantic Textual Similarity ES-ES 2017

Evaluates the degree to which two Spanish sentences are semantically equivalent. Similarity scores range from 0 (no overlap in meaning) to 5 (equivalence of meaning); values in between reflect interpretable levels of partial overlap in meaning.

Publication: Daniel Cer, Mona Diab, Eneko Agirre, Iñigo Lopez-Gazpio, and Lucia Specia. 2017. SemEval-2017 Task 1: Semantic Textual Similarity Multilingual and Crosslingual Focused Evaluation. In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), pages 1–14, Vancouver, Canada. Association for Computational Linguistics.
Language: Spanish
NLP topic: Abstract task
Year: 2017
Ranking metric: Pearson correlation
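
The ranking metric, Pearson correlation, measures how closely a system's predicted similarity scores follow the gold annotations in a linear sense. A minimal sketch in plain Python is given here for illustration only: the function name `pearson` and the sample scores are assumptions made up for this example, and this is not the task's official scorer.

```python
import math


def pearson(gold, pred):
    """Pearson correlation between two equal-length sequences of scores."""
    if len(gold) != len(pred) or len(gold) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    n = len(gold)
    mean_g = sum(gold) / n
    mean_p = sum(pred) / n
    # Covariance and variances (unnormalised; the 1/n factors cancel out).
    cov = sum((g - mean_g) * (p - mean_p) for g, p in zip(gold, pred))
    var_g = sum((g - mean_g) ** 2 for g in gold)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    if var_g == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_g * var_p)


# Hypothetical gold annotations and system predictions on the 0-5 scale.
gold_scores = [0.0, 1.5, 3.0, 4.2, 5.0]
system_scores = [0.4, 1.0, 3.3, 3.9, 4.8]
print(round(pearson(gold_scores, system_scores), 4))
```

Because the metric is invariant to shifting and positive rescaling of the predictions, a system is rewarded for following the gold scores linearly rather than for matching their absolute values.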

Task results

System                                     Pearson correlation
Dccuchile bert base spanish wwm cased      0.8330
Xlm roberta large                          0.8287
PlanTL GOB ES roberta large bne            0.8232
Ixa ehu ixambert base cased                0.8120
PlanTL GOB ES roberta base bne             0.8096
CenIA distillbert base spanish uncased     0.7951
Bert base multilingual cased               0.7920
Xlm roberta base                           0.7861
Bertin roberta base spanish                0.7818
Distilbert base multilingual cased         0.7781

If you have published a result better than those on the list, send a message to odesia-comunicacion@lsi.uned.es indicating the result and the DOI of the article, along with a copy of it if it is not published openly.