Projects

Showing 1-20 of 116 projects.

| Name | Description | # Ann. | Author | Maintainer | Updated at | Status |
|---|---|---|---|---|---|---|
| bionlp-st-ge-2016-spacy-parsed | Dependency parses produced by the spaCy parser, and part-of-speech tags produced by the Stanford tagger (with the wsj-0-18-left3words-nodistsim model). The exact procedure is described here. The data set contains the 34 full-paper articles used in the BioNLP 2016 GE task. | 226 K | Nico Colic | Nico Colic | 2016-05-25 | Released |
| bionlp-st-ge-2016-test-tees | NER and event extraction produced by TEES (with the default GE11 model) for the 14 full papers used in the BioNLP 2016 GE task test corpus. | 9.17 K | Nico Colic | Nico Colic | 2016-05-25 | Released |
| bionlp-st-ge-2016-reference-tees | NER and event extraction produced by TEES (with the default GE11 model) for the 20 full papers used in the BioNLP 2016 GE task reference corpus. | 14.6 K | Nico Colic | Nico Colic | 2016-05-25 | Released |
| FSU-PRGE | A new broad-coverage corpus composed of 3,306 MEDLINE abstracts dealing with gene and protein mentions. The annotation process was semi-automatic. Publication: http://aclweb.org/anthology/W/W10/W10-1838.pdf | 59.5 K | CALBC Project | Yue Wang | 2017-03-08 | Released |
| spacy-test | Random set of articles used for testing during development of the RESTful spaCy parsing web service. Since development is now finished, they are released for the community to use. | 137 K | Nico Colic | Nico Colic | 2019-03-16 | Released |
| craft-sa-dev | Development data for the CRAFT SA shared task. This project contains the development (training) annotations for the Structural Annotation task of the CRAFT Shared Task 2019. This particular set contains token and sentence annotations, with tokens linked via dependency relations. These dependency relations were automatically generated using the manually curated CRAFT constituency treebank files as input. | 512 K | University of Colorado Anschutz Medical Campus | craft-st | 2019-03-25 | Released |
| GlyCosmos600-docs | A random collection of 600 PubMed abstracts from 6 glycobiology-related journals: Glycobiology, Glycoconjugate Journal, The Journal of Biological Chemistry, Journal of Proteome Research, Journal of Proteomics, and Carbohydrate Research. All PMIDs were collected on June 11, 2019. From each journal, 100 PMIDs were randomly sampled. | 0 | | Jin-Dong Kim | 2019-06-11 | Released |
| DisGeNET5_variant_disease | Variant-disease associations obtained by text mining MEDLINE abstracts using the BeFree system, including the variant and disease offsets. | 144 K | IBI Group | Yue Wang | 2020-02-01 | Released |
| DisGeNET5_gene_disease | Gene-disease associations obtained by text mining MEDLINE abstracts using the BeFree system, including the gene and disease offsets. | 2.04 M | IBI Group | Yue Wang | 2020-02-02 | Released |
| GoldHamster1_PubTatorCentral | Predictions from PubTator Central for the articles in the GoldHamster1 corpus. | 13.8 K | | | 2020-02-26 | Released |
| GoldHamster1_StructuredAbstracts | Sections from PubMed structured abstracts for the articles in the GoldHamster1 corpus. | 824 | | | 2020-02-26 | Released |
| GoldHamster1_ArguminSci | Predictions from ArguminSci for the abstracts in the GoldHamster1 corpus. | 10.2 K | | | 2020-02-27 | Released |
| GoldHamster1_Cellosaurus | Predictions of cell lines from Cellosaurus for the abstracts in the GoldHamster1 corpus. | 12.3 K | | | 2020-02-28 | Released |
| CORD-19_All_docs | All the documents in the whole CORD-19 dataset. The documents in this project will be updated as the CORD-19 dataset grows. See the COVID DATASET LICENSE AGREEMENT. | 0 | | Jin-Dong Kim | 2020-03-23 | Released |
| CORD-19_bioRxiv_medRxiv_subset | The bioRxiv/medRxiv subset of the CORD-19 dataset: pre-prints that are not peer reviewed. The documents in this project will be updated as the CORD-19 dataset grows. See the COVID DATASET LICENSE AGREEMENT. | 0 | | Jin-Dong Kim | 2020-03-23 | Released |
| CORD-19_Commercial_use_subset | The commercial-use subset of the CORD-19 dataset. The documents in this project will be updated as the CORD-19 dataset grows. See the COVID DATASET LICENSE AGREEMENT. | 0 | | Jin-Dong Kim | 2020-03-23 | Released |
| CORD-19_Custom_license_subset | The custom-license subset of the CORD-19 dataset. The documents in this project will be updated as the CORD-19 dataset grows. See the COVID DATASET LICENSE AGREEMENT. | 0 | | Jin-Dong Kim | 2020-03-23 | Released |
| CORD-19_Non-commercial_use_subset | The non-commercial-use subset of the CORD-19 dataset. The documents in this project will be updated as the CORD-19 dataset grows. See the COVID DATASET LICENSE AGREEMENT. | 0 | | Jin-Dong Kim | 2020-03-23 | Released |
| PubMed_ArguminSci | Predictions for PubMed automatically extracted with the ArguminSci tool (https://github.com/anlausch/ArguminSci). | 15.3 K | | zebet | 2020-03-31 | Released |
| LitCovid-PubTatorCentral | Named entities for the documents in the LitCovid dataset. Annotations were automatically predicted by the PubTatorCentral tool (https://www.ncbi.nlm.nih.gov/research/pubtator/). | 4.64 K | | zebet | 2020-04-01 | Released |