The corpus contains abstracts of biomedical scientific articles from Elsevier journals, collected between 2017 and 2018. It is provided in two partitions: training and evaluation. The training partition contains 500 texts, which correspond to the training and evaluation partitions made public for the DIANN competition at IberLEF 2018. The evaluation partition is a private test set of 100 texts; because it is used to evaluate systems on the ODESIA Leaderboard, it will not be made public. All disabilities mentioned in the texts have been annotated in the corpus.
Language(s): English
Year: 2023
Domain: Health
Text types: Abstracts of scientific articles
Format: json
NLP Topic:
Number of units: 600
Type of units: Documents
Tokens: 108412
Documents: 600
Training set size: 500
Test set size: 100
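As a quick consistency check on the figures above, the following minimal sketch recomputes the total document count and derives the split proportions and the mean number of tokens per document. It assumes that the reported token count covers both partitions, which the source does not state explicitly; the function name `partition_summary` is hypothetical and is not part of any corpus tooling.

```python
# Figures copied from the metadata listed above.
TOTAL_DOCUMENTS = 600
TOTAL_TOKENS = 108412  # assumption: counted over both partitions
TRAIN_SIZE = 500
TEST_SIZE = 100


def partition_summary(train: int, test: int, tokens: int) -> dict:
    """Return basic derived statistics for a two-partition corpus."""
    total = train + test
    return {
        "total_documents": total,
        "train_fraction": train / total,
        "test_fraction": test / total,
        "mean_tokens_per_document": tokens / total,
    }


summary = partition_summary(TRAIN_SIZE, TEST_SIZE, TOTAL_TOKENS)

# The two partition sizes should add up to the reported number of documents.
assert summary["total_documents"] == TOTAL_DOCUMENTS

for name, value in summary.items():
    print(f"{name}: {value}")
```

Under that assumption, the abstracts average roughly 180 tokens each, and the training partition holds five sixths of the documents.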
          
