This task evaluates the degree to which two English sentences are semantically equivalent. Similarity scores range from 0 (no overlap in meaning) to 5 (equivalent meaning). Values in between reflect interpretable levels of partial overlap in meaning.
Publication
Daniel Cer, Mona Diab, Eneko Agirre, Iñigo Lopez-Gazpio, and Lucia Specia. 2017. SemEval-2017 Task 1: Semantic Textual Similarity Multilingual and Crosslingual Focused Evaluation. In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), pages 1–14, Vancouver, Canada. Association for Computational Linguistics.
Language
English
NLP topic
Abstract task
Year
2017
Ranking metric
Pearson correlation
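
As an illustration of the ranking metric, the following is a minimal sketch of how a Pearson correlation between gold similarity scores and system predictions could be computed. The function name `pearson` and the example score lists are hypothetical and not taken from the task data; the official evaluation script may differ in details such as input handling.

```python
import math


def pearson(gold, pred):
    """Pearson correlation between two equal-length lists of scores."""
    if len(gold) != len(pred) or len(gold) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    mean_g = sum(gold) / len(gold)
    mean_p = sum(pred) / len(pred)
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((g - mean_g) * (p - mean_p) for g, p in zip(gold, pred))
    # Sums of squared deviations (unnormalized variances).
    var_g = sum((g - mean_g) ** 2 for g in gold)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    if var_g == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_g * var_p)


# Hypothetical gold and system scores on the 0-5 scale (illustrative only).
gold = [0.0, 1.5, 3.0, 4.2, 5.0]
pred = [0.4, 1.1, 2.8, 4.5, 4.9]
r = pearson(gold, pred)
print(round(r, 4))
```

A value close to 1 means the system's scores rise and fall with the gold scores. Because the metric is invariant to linear rescaling, a system does not need to reproduce the exact 0 to 5 values to rank well.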
Task results
System | Precision | Recall | F1 | CEM | Accuracy | MacroPrecision | MacroRecall | MacroF1 | RMSE | MicroPrecision | MicroRecall | MicroF1 | MAE | MAP | UAS | LAS | MLAS | BLEX | Pearson correlation | Spearman correlation | MeasureC | BERTScore | EMR | Exact Match | F0.5 | Hierarchical F | ICM | MeasureC | Propensity F | Reliability | Sensitivity | Sentiment Graph F1 | WAC | b2 | erde30 | sent | weighted f1 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Roberta large | 0.8656 | 0.8656 | 0.8656 | 0.8656 | 0.87 | ||||||||||||||||||||||||||||||||
Roberta base | 0.8572 | 0.8572 | 0.8572 | 0.8572 | 0.86 | ||||||||||||||||||||||||||||||||
Xlm roberta large | 0.8450 | 0.8450 | 0.8450 | 0.8450 | 0.84 | ||||||||||||||||||||||||||||||||
Bert base cased | 0.8434 | 0.8434 | 0.8434 | 0.8434 | 0.84 | ||||||||||||||||||||||||||||||||
Distilbert base uncased | 0.8360 | 0.8360 | 0.8360 | 0.8360 | 0.84 | ||||||||||||||||||||||||||||||||
Ixa ehu ixambert base cased | 0.8170 | 0.8170 | 0.8170 | 0.8170 | 0.79 | ||||||||||||||||||||||||||||||||
Bert base multilingual cased | 0.8112 | 0.8112 | 0.8112 | 0.8112 | 0.81 | ||||||||||||||||||||||||||||||||
Xlm roberta base | 0.8097 | 0.8097 | 0.8097 | 0.8097 | 0.81 | ||||||||||||||||||||||||||||||||
Distilbert base multilingual cased | 0.7872 | 0.7872 | 0.7872 | 0.7872 | 0.79 |