craft-sa-dev | | Development data for CRAFT SA shared task. This project contains the development (training) annotations for the Structural Annotation task of the CRAFT Shared Task 2019. This particular set contains token and sentence annotations with tokens linked via dependency relations. These dependency relations were automatically generated using the manually curated CRAFT constituency treebank files as input. | 490 K | University of Colorado Anschutz Medical Campus | craft-st | 2023-11-27 | Released | |
performance-test | | A project for performance testing. | 480 K | | Jin-Dong Kim | 2023-11-27 | Testing | |
Oryza-OGER | | | 462 K | | fabiorinaldi | 2023-11-29 | | |
LitCovid-TimeML | | | 426 K | | Jin-Dong Kim | 2023-11-29 | Developing | |
GlyTouCan-IUPAC | | | 399 K | | kiyoko | 2023-11-29 | Testing | |
OryzaGP_2021_FLAIR | | | 386 K | | larmande | 2023-11-29 | Developing | |
LitCovid-PD-GO-BP | | Terms for biological processes, as defined in GO | 374 K | | Jin-Dong Kim | 2023-11-29 | Developing | |
21k_plant_trait_mention | | | 333 K | xzyao | | 2023-11-29 | Testing | |
LitCovid-OGER-BB | | Annotations for 10 different vocabularies, obtained using OGER (www.ontogene.com) and BioBERT. | 308 K | Fabio Rinaldi | Nico Colic | 2023-11-28 | Released | |
GoldHamster | | | 285 K | | zebet | 2023-11-29 | Beta | |
Glycosmos6-MAT | | Automatic annotation by PD-MAT. | 263 K | | Jin-Dong Kim | 2023-11-29 | Developing | |
bionlp-st-ge-2016-spacy-parsed | | Dependency parses produced by the spaCy parser, and part-of-speech tags produced by the Stanford tagger (with the wsj-0-18-left3words-nodistsim model). The exact procedure is described here. The data set contains the 34 full-paper articles used in the BioNLP 2016 GE task. | 225 K | Nico Colic | Nico Colic | 2023-11-29 | Released | |
OryzaGP_2021_v2 | | OryzaGP_2021_v2 will use a second annotator | 208 K | | larmande | 2023-11-29 | Developing | |
pmc-enju-pas | | Predicate-argument structure annotation produced by Enju. This data set was initially produced as a supporting resource for the BioNLP-ST 2016 GE task. As such, it currently includes the 34 full-paper articles in the benchmark data sets of the GE 2016 task, namely the reference data set (bionlp-st-ge-2016-reference) and the test data set (bionlp-st-ge-2016-test), but it will be extended to include more papers from the PubMed Central Open Access subset (PMCOA). | 205 K | DBCLS | Jin-Dong Kim | 2023-11-28 | Developing | |
PubTator4TogoVar | | | 198 K | PubTator | Yasunori Yamamoto | 2024-01-10 | Developing | |
DisGeNET5_variant_disease | | The file contains variant-disease associations obtained by text mining MEDLINE abstracts using the BeFree system, including the variant and disease offsets. | 144 K | IBI Group | Yue Wang | 2023-11-24 | Released | |
spacy-test | | Random set of articles used for testing in the development of the RESTful spaCy parsing web service. Since development is now finished, they are released for the community to use. | 131 K | Nico Colic | Nico Colic | 2023-11-29 | Released | |
PubMed_Structured_Abstracts | | Sections (zones) as retrieved from PubMed. | 131 K | | zebet | 2023-11-28 | Released | |
LitCovid-PAS-Enju | | Predicate-argument structure annotation produced by the Enju parser. | 125 K | | Jin-Dong Kim | 2023-11-28 | Beta | |
GlyCosmos6-Glycan-Motif-Structure | | Automatic annotation by Covid-19_Glycan-Motif. | 107 K | | Jin-Dong Kim | 2024-07-25 | Developing | |