NGLY1-deficiency | | A collection of PubMed abstracts that may be related to NGLY1 deficiency. | 60.5 K | | Jin-Dong Kim | 2023-11-29 | Developing | |
Preeclampsia-compare | | | 67.2 K | | Jin-Dong Kim | 2023-11-29 | Testing | |
CORD-PICO | | Automatic annotation of the CORD-19 dataset with PICO categories. The corpus was automatically labeled with an LSTM-CRF model trained on human-annotated PubMed abstracts from https://github.com/bepnye/EBM-NLP. Currently, only titles and abstracts are annotated, using Population, Intervention and Outcome labels as well as more fine-grained labels such as Age, Drug, Mortality and others. | 69.6 K | Simon Suster | ssuster | 2023-11-27 | Developing | |
GlyCosmos6-Glycan-Motif-Image | | | 87.8 K | | Jin-Dong Kim | 2023-11-24 | Developing | |
UCDIT_TEST | | colitis link | 91.6 K | | alo33 | 2023-11-27 | Testing | |
GlycoBiology-FMA | | FMA ontology-based annotation to GlycoBiology abstracts | 96.3 K | | Jin-Dong Kim | 2023-11-29 | Testing | |
oger-json-test | | Test corpus for testing OGER web service | 97.6 K | | Nico Colic | 2023-11-29 | Testing | |
GlyCosmos6-Glycan-Motif-Structure | | Automatic annotation by Covid-19_Glycan-Motif. | 107 K | | Jin-Dong Kim | 2023-11-24 | Developing | |
LitCovid-PAS-Enju | | Predicate-argument structure annotation produced by the Enju parser. | 125 K | | Jin-Dong Kim | 2023-11-28 | Beta | |
spacy-test | | Random set of articles used for testing in the development of the RESTful spaCy parsing web service. Since development is now finished, they are released for the community to use. | 131 K | Nico Colic | Nico Colic | 2023-11-29 | Released | |
DisGeNET5_variant_disease | | The file contains variant-disease associations obtained by text mining MEDLINE abstracts using the BeFree system, including the variant and disease offsets. | 144 K | IBI Group | Yue Wang | 2023-11-24 | Released | |
PubTator4TogoVar | | | 198 K | PubTator | Yasunori Yamamoto | 2024-01-10 | Developing | |
pmc-enju-pas | | Predicate-argument structure annotation produced by Enju. This data set was initially produced as a supporting resource for the BioNLP-ST 2016 GE task. As such, it currently includes the 34 full paper articles in the benchmark data sets of the GE 2016 task: the reference data set (bionlp-st-ge-2016-reference) and the test data set (bionlp-st-ge-2016-test). It will be extended to include more papers from the PubMed Central Open Access subset (PMCOA). | 205 K | DBCLS | Jin-Dong Kim | 2023-11-28 | Developing | |
bionlp-st-ge-2016-spacy-parsed | | Dependency parses produced by the spaCy parser, and part-of-speech tags produced by the Stanford tagger (with the wsj-0-18-left3words-nodistsim model). The exact procedure is described here. The data set contains the 34 full paper articles used in the BioNLP 2016 GE task. | 225 K | Nico Colic | Nico Colic | 2023-11-29 | Released | |
mondo_disease | | Annotation for diseases and disorders as defined in MONDO. Automatic annotation by PD-MONDO. | 256 K | | Jin-Dong Kim | 2023-11-28 | Developing | |
Glycosmos6-MAT | | Automatic annotation by PD-MAT. | 263 K | | Jin-Dong Kim | 2023-11-29 | Developing | |
LitCovid-OGER-BB | | Using OGER (www.ontogene.com) and BioBERT to obtain annotations for 10 different vocabularies. | 308 K | Fabio Rinaldi | Nico Colic | 2023-11-28 | Released | |
LitCovid-PD-GO-BP | | Terms for biological processes, as defined in GO. | 374 K | | Jin-Dong Kim | 2023-11-29 | Developing | |
OryzaGP_2021_FLAIR | | | 386 K | | larmande | 2023-11-29 | Developing | |
GlyTouCan-IUPAC | | | 399 K | | kiyoko | 2023-11-29 | Testing | |