Multilingual Complex Named Entity Recognition 2022

The task consists of detecting and labeling semantically ambiguous and complex entities in short, low-context settings. Complex NEs, such as the titles of creative works (movie, book, song, and software names), are not simple nouns and are harder to recognize. They can take the form of any linguistic constituent, such as an imperative clause (“Dial M for Murder”), and do not look like traditional NEs (person names, locations, organizations).

The task is performed on the MULTICONER dataset (Malmasi et al., 2022). MULTICONER provides data from three domains (Wikipedia sentences, questions, and search queries) across 11 different languages, which are used to define 11 monolingual subsets of the shared task. Additionally, the dataset has multilingual and code-mixed subsets.


The following six entity types are tagged: names of people (PER), locations and physical facilities (LOC), corporations and businesses (CORP), all other groups (GRP), consumer products (PROD), and titles of creative works such as movies, songs, and books (CW).
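MultiCoNER data is distributed with token-level BIO tags, which systems must decode into labeled entity spans. The following is an illustrative sketch (not the official scorer or reader) of that decoding step; the tag names follow the MultiCoNER taxonomy, and the helper function name is our own.

```python
def bio_to_spans(tokens, tags):
    """Collect (entity_type, text) spans from a BIO-tagged token sequence."""
    spans, current = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always opens a new entity, closing any open one.
            if current:
                spans.append(current)
            current = (tag[2:], [token])
        elif tag.startswith("I-") and current and tag[2:] == current[0]:
            # An I- tag continues the open entity of the same type.
            current[1].append(token)
        else:
            # O tag (or an I- tag with no matching open entity) closes the span.
            if current:
                spans.append(current)
            current = None
    if current:
        spans.append(current)
    return [(label, " ".join(words)) for label, words in spans]

# Complex NE example from the task description: a creative-work title
# that is an imperative clause, followed by a person name.
tokens = ["dial", "m", "for", "murder", "by", "alfred", "hitchcock"]
tags   = ["B-CW", "I-CW", "I-CW", "I-CW", "O", "B-PER", "I-PER"]
print(bio_to_spans(tokens, tags))
# → [('CW', 'dial m for murder'), ('PER', 'alfred hitchcock')]
```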

Publication
Shervin Malmasi, Anjie Fang, Besnik Fetahu, Sudipta Kar, and Oleg Rokhlenko. 2022. SemEval-2022 Task 11: Multilingual Complex Named Entity Recognition (MultiCoNER). In Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022), pages 1412–1437, Seattle, United States. Association for Computational Linguistics.
Language
Spanish
Year
2022
Ranking metric
F1

Task results

System | Precision | Recall | F1
xlm-roberta-large | 0.6801 | 0.6801 | 0.6801
xlm-roberta-base | 0.6201 | 0.6201 | 0.6201
PlanTL-GOB-ES/roberta-large-bne | 0.6069 | 0.6069 | 0.6069
PlanTL-GOB-ES/roberta-base-bne | 0.6041 | 0.6041 | 0.6041
bert-base-multilingual-cased | 0.5992 | 0.5992 | 0.5992
ixa-ehu/ixambert-base-cased | 0.5926 | 0.5926 | 0.5926
CenIA/distillbert-base-spanish-uncased | 0.5894 | 0.5894 | 0.5894
distilbert-base-multilingual-cased | 0.5580 | 0.5580 | 0.5580
dccuchile/bert-base-spanish-wwm-cased | 0.5472 | 0.5472 | 0.5472
bertin-project/bertin-roberta-base-spanish | 0.5215 | 0.5215 | 0.5215
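The ranking metric above is entity-level F1. A minimal sketch of how such a score is typically computed, assuming standard CoNLL-style exact matching of (type, start, end) spans (this is not the official ODESIA scorer):

```python
def entity_f1(gold, pred):
    """Micro-averaged precision, recall and F1 over exact entity-span matches.

    gold and pred are iterables of (entity_type, start, end) tuples;
    a prediction counts as correct only if type and boundaries all match.
    """
    gold_set, pred_set = set(gold), set(pred)
    tp = len(gold_set & pred_set)  # exact matches
    precision = tp / len(pred_set) if pred_set else 0.0
    recall = tp / len(gold_set) if gold_set else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical example: one span matched exactly, one mislabeled (CW vs LOC).
gold = [("CW", 0, 4), ("PER", 5, 7)]
pred = [("CW", 0, 4), ("LOC", 5, 7)]
print(entity_f1(gold, pred))  # → (0.5, 0.5, 0.5)
```

Note that when the number of predicted spans equals the number of gold spans, precision, recall, and F1 coincide, which is consistent with the identical values per row in the table above.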

If you have published a result better than those listed, send a message to odesia-comunicacion@lsi.uned.es indicating the result and the DOI of the article, attaching a copy of it if it is not openly published.