
Projects

Showing projects 41-60 of 593.

| Name | Description | # Ann. | Author | Maintainer | Updated_at | Status |
|---|---|---|---|---|---|---|
| bionlp-st-2016-SeeDev-training | Entities and event annotations from the training set of the BioNLP-ST 2016 SeeDev task. The SeeDev task focuses on seed storage and reserve accumulation in the model organism Arabidopsis thaliana. It is based on the knowledge model Gene Regulation Network for Arabidopsis (GRNA), which meets the needs of text mining (i.e. manual annotation of texts and automatic information extraction), experimental data indexing and retrieval, and reuse in other plant systems. It is also expected to meet the requirements of integrating text knowledge with knowledge derived from experimental data for modeling in systems biology. The GRNA model defines 16 different types of entities and 22 types of events (in five sets of event types) that may be combined into complex events. For more information, please refer to the task website. All annotations: Train set, Development set, Test set (without events). | 35 | | EstelleChaix | 2023-11-28 | Released |
| SPECIES800 | SPECIES 800 (S800): an abstract-based manually annotated corpus. S800 comprises 800 PubMed abstracts in which organism mentions were identified and mapped to the corresponding NCBI Taxonomy identifiers. Described in: The SPECIES and ORGANISMS Resources for Fast and Accurate Identification of Taxonomic Names in Text. Pafilis E, Frankild SP, Fanini L, Faulwetter S, Pavloudi C, et al. (2013). PLoS ONE, 2013, 8(6): e65390. doi:10.1371/journal.pone.0065390 | 3.71 K | Evangelos Pafilis, Sune P. Frankild, Lucia Fanini, Sarah Faulwetter, Christina Pavloudi, Aikaterini Vasileiadou, Christos Arvanitidis, Lars Juhl Jensen | evangelos | 2023-11-28 | Released |
| PubMed_Structured_Abstracts | Sections (zones) as retrieved from PubMed. | 131 K | | zebet | 2023-11-28 | Released |
| 2015-BEL-Sample-2 | The 295 BEL statements for the sample set used for the 2015 BioCreative challenge. | 11.4 K | Fabio Rinaldi | Nico Colic | 2023-11-28 | Released |
| SMAFIRA_Feedback_Research_Goal | | 15 | | zebet | 2023-11-28 | Released |
| LitCovid-OGER-BB | Annotations for 10 different vocabularies, obtained using OGER (www.ontogene.com) and BioBERT. | 308 K | Fabio Rinaldi | Nico Colic | 2023-11-28 | Released |
| SCAI-Test | A small corpus for the evaluation of dictionaries containing chemical entities. Publication: http://www.scai.fraunhofer.de/fileadmin/images/bio/data_mining/paper/kolarik2008.pdf Original source: https://www.scai.fraunhofer.de/en/business-research-areas/bioinformatics/downloads/corpora-for-chemical-entity-recognition.html | 1.21 K | CALBC Project | Yue Wang | 2023-11-28 | Released |
| CellFinder | CellFinder corpus | 4.75 K | Mariana Neves, Alexander Damaschun, Andreas Kurtz, Ulf Leser | Mariana Neves | 2023-11-27 | Released |
| 123123123 | 123123123 | 150 | | yaoxinzhi | 2023-11-27 | Released |
| PIR-corpus1 | The Protein Information Resource (PIR) is not biased towards any particular biomedical domain, and is expected to provide more diverse protein names in a given sample size. Annotation categories: protein, compound-protein, acronym. | 4.44 K | University of Delaware and Georgetown University Medical Center | Yue Wang | 2023-11-27 | Released |
| LitCovid-sentences-v1 | Sentence segmentation of all the texts in the LitCovid literature. The segmentation is automatically obtained using the TextSentencer annotation service developed and maintained by DBCLS. | 16.5 K | | Jin-Dong Kim | 2023-11-27 | Released |
| Zoonoses_partialAnnotation | A part of the Zoonoses project, used by PanZoora. The Zoonoses project provides the complete manually annotated data, whereas this project contains only part of it. | 266 | | AikoHIRAKI | 2023-11-27 | Released |
| bionlp-st-ge-2016-test-proteins | Protein annotations for the benchmark test data set of the BioNLP-ST 2016 GE task. A participant of the GE task may import the documents and annotations of this project into their own project as a starting point for producing event annotations. For more details, please refer to the benchmark test data set (bionlp-st-ge-2016-test). | 4.34 K | DBCLS | Jin-Dong Kim | 2023-11-27 | Released |
| bionlp-st-pc-2013-training | The training dataset from the pathway curation (PC) task in the BioNLP Shared Task 2013. The entity types defined in the PC task are simple chemical, gene or gene product, complex, and cellular component. | 7.86 K | NaCTeM and KISTI | Yue Wang | 2023-11-27 | Released |
| craft-sa-dev | Development data for the CRAFT SA shared task. This project contains the development (training) annotations for the Structural Annotation task of the CRAFT Shared Task 2019. This particular set contains token and sentence annotations, with tokens linked via dependency relations. These dependency relations were automatically generated using the manually curated CRAFT constituency treebank files as input. | 490 K | University of Colorado Anschutz Medical Campus | craft-st | 2023-11-27 | Released |
| LitCovid-PubTatorCentral | Named entities for the documents in the LitCovid dataset. Annotations were automatically predicted by the PubTatorCentral tool (https://www.ncbi.nlm.nih.gov/research/pubtator/). | 4.64 K | | zebet | 2023-11-27 | Released |
| LitCovid-ArguminSci | Discourse elements for the documents in the LitCovid dataset. Annotations were automatically predicted by the ArguminSci tool (https://github.com/anlausch/ArguminSci). | 4.9 K | | zebet | 2023-11-27 | Released |
| LitCovid-PD-FMA-UBERON-v1 | PubDictionaries annotation for anatomy terms, updated at 2020-04-20. Disease term annotation based on FMA and Uberon. Version 2020-04-20. The terms in FMA and Uberon are loaded in PubDictionaries (FMA and Uberon), with which the annotations in this project are produced. The parameter configuration used for this project is here for FMA and there for Uberon. Note that this is an automatically generated dictionary-based annotation. It will be updated periodically as more documents are added and the dictionary is improved. | 4.3 K | | Jin-Dong Kim | 2023-11-27 | Released |
| CORD-19-PD-MONDO | PubDictionaries annotation for MONDO terms, updated at 2020-04-30. Disease term annotation based on MONDO. Version 2020-04-20. The terms in MONDO are loaded in PubDictionaries, with which the annotations in this project are produced. The parameter configuration used for this project is here. Note that this is an automatically generated dictionary-based annotation. It will be updated periodically as more documents are added and the dictionary is improved. | 6.32 M | | Jin-Dong Kim | 2023-11-27 | Released |
| FSU-PRGE | A new broad-coverage corpus composed of 3,306 MEDLINE abstracts dealing with gene and protein mentions. The annotation process was semi-automatic. Publication: http://aclweb.org/anthology/W/W10/W10-1838.pdf | 59.5 K | CALBC Project | Yue Wang | 2023-11-26 | Released |