EXIST-2023-EN

The EXIST 2023 English corpus is a collection of tweets labelled with information related to sexism: whether the tweet is sexist, the intention of the tweet's author, and the type of sexism expressed.

Language(s)
English
Year
2023
Domain
Social
Text types
Tweets
Annotations
Binary label indicating whether a tweet expresses sexism, plus multiclass labels for the type of sexism and the intention of the author
Format
JSON
Data access
Registration

Publication
Plaza, L. et al. (2023). Overview of EXIST 2023 – Learning with Disagreement for Sexism Identification and Characterization. In: Arampatzis, A., et al. Experimental IR Meets Multilinguality, Multimodality, and Interaction. CLEF 2023. Lecture Notes in Computer Science, vol 14163. Springer, Cham. https://doi.org/10.1007/978-3-031-42448-9_23
NLP Topic
Number of units
4152
Type of units
Tweets
Training set size
2870
Test set size
838
Development set size
444
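
The split sizes listed above can be checked against the total number of units. The following is a minimal sketch using only the figures from this card; the names `SPLITS`, `TOTAL_UNITS`, and `split_shares` are illustrative and not part of the dataset or any official tooling.

```python
# Split sizes and total as listed on this dataset card.
SPLITS = {"train": 2870, "test": 838, "dev": 444}
TOTAL_UNITS = 4152


def split_shares(splits):
    """Return each split's share of the combined size (hypothetical helper)."""
    total = sum(splits.values())
    return {name: size / total for name, size in splits.items()}


# The three splits should add up to the reported number of units.
assert sum(SPLITS.values()) == TOTAL_UNITS

shares = split_shares(SPLITS)
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

With these figures the training set is roughly 69% of the corpus, the test set about 20%, and the development set about 11%.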

If you have published a result better than those on the list, send a message to odesia-comunicacion@lsi.uned.es indicating the result and the DOI of the article, and attach a copy of the article if it is not openly available.