Evaluates the degree to which two English sentences are semantically equivalent. Similarity scores range from 0 (no overlap in meaning) to 5 (equivalence of meaning), with intermediate values reflecting interpretable levels of partial overlap.
Publication
Daniel Cer, Mona Diab, Eneko Agirre, Iñigo Lopez-Gazpio, and Lucia Specia. 2017. SemEval-2017 Task 1: Semantic Textual Similarity Multilingual and Crosslingual Focused Evaluation. In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), pages 1–14, Vancouver, Canada. Association for Computational Linguistics.
Language
English
NLP topic
Abstract task
Year
2017
Publication link
Ranking metric
Pearson correlation
Task results
| System | Pearson correlation |
|---|---|
| bert-base-multilingual-cased | 0.8000 |
| ixambert-base-cased | 0.8200 |
| distilbert-base-multilingual-cased | 0.7600 |
| xlm-roberta-base | 0.8000 |
| distilbert-base-uncased | 0.8100 |
| bert-base-cased | 0.8200 |
| roberta-base | 0.8500 |
| roberta-large | 0.8600 |
| xlm-roberta-large | 0.8400 |
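The ranking metric above is the Pearson correlation between each system's predicted similarity scores and the gold 0–5 annotations. A minimal sketch of how that score is computed (the `gold` and `pred` lists are hypothetical, not taken from the task data):

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical gold STS annotations (0-5 scale) and system predictions.
gold = [5.0, 3.8, 1.2, 0.0, 4.4]
pred = [4.6, 3.5, 1.8, 0.4, 4.9]
print(round(pearson(gold, pred), 4))
```

A value of 1.0 means the system's scores rise and fall in perfect linear agreement with the gold annotations; the 0.76–0.86 range in the table reflects strong but imperfect agreement.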