
Projects

Projects 241-260 of 556

| Name | Description | # Ann. | Author / Maintainer | Updated_at | Status |
|---|---|---|---|---|---|
| OryzaGP1 | A dataset for Named Entity Recognition for rice genes | 0 | Huy Do. Pierre Larmande | 2019-01-31 | Uploading |
| AGCA_Sue | Active Gene Annotation Corpus for the Application in Drug Repurposing Discovery | 0 | Jingbo Xia, Xuan Qin, Kaiyin Zhou | 2023-11-29 | Developing |
| Preeclampsia | A collection of titles and abstracts of "Preeclampsia"-related papers. They were extracted from PubMed on 11th September, 2017, using the MeSH term "Preeclampsia" and specifying the language as "English". The texts were then annotated by PubDictionaries using the dictionary "Preeclampsia". | 58.7 K | callahan_tiff | 2023-11-29 | Developing |
| PubMed-French-test | A collection of PubMed abstracts written in French | 0 | Jin-Dong Kim | 2023-11-29 | Developing |
| PubMed-German-test | A collection of PubMed abstracts written in German | 0 | Jin-Dong Kim | 2023-11-24 | Developing |
| NGLY1-deficiency | A collection of PubMed abstracts that may be related to NGLY1 deficiency. | 60.5 K | Jin-Dong Kim | 2023-11-29 | Developing |
| MeasurableQuantitativeAnnotation | A collection and annotation of measurable quantity information from 3202 PubMed articles, which can be used for the task of extracting measurable quantity information. Annotation categories: entity, num, unit. | 2.84 K | WenjieNie | 2023-11-29 | Testing |
| PubMed-2017 | Abstracts published in 2017. | 0 | Jin-Dong Kim | 2023-11-24 | Developing |
| pubmed-2016 | Abstracts published in 2016. | 0 | Jin-Dong Kim | 2023-11-28 | |
| PubMed-2000 | Abstracts published in 2000. | 0 | Jin-Dong Kim | 2023-11-29 | Developing |
| PubCasesCollection | Abstracts in PubCases | 0 | Jin-Dong Kim | 2023-11-29 | |
| RELISH-DB | Abstracts contained in the data of the RELISH-DB (https://relishdb.ict.griffith.edu.au) made available for download here. Data was downloaded from: https://figshare.com/projects/RELISH-DB/60095 Related publication: https://academic.oup.com/database/article/doi/10.1093/database/baz085/5608006#200722023 | 0 | | 2023-11-29 | Released |
| pubmed-sentences-benchmark | A benchmark dataset for text segmentation into sentences. The source of the annotation is the GENIA treebank v1.0. The process was as follows. Sentence annotations were extracted from the GENIA treebank v1.0, converted to PubAnnotation JSON, and uploaded. 12 abstracts failed alignment. Of these, 4 had a dot ('.') character where there should be a colon (':'); they were manually fixed and then successfully uploaded: 7903907, 8053950, 8508358, 9415639. The other 8 were "250 word truncation" cases; they were manually fixed and successfully uploaded, and manual annotations were added for the missing pieces of text during the fixing. 30 abstracts had extra text at the end indicating a copyright statement, e.g., "Copyright 1998 Academic Press." These were annotated as sentences in GTB, but the text no longer exists in PubMed, so the extra text was removed together with its sentence annotations. | 18.4 K | GENIA project / Jin-Dong Kim | 2023-11-28 | Released |
| guideline annotations | 5 guideline annotations with custom vocab | 0 | Tiffany Leung | 2015-11-07 | Developing |
| Virus300 | 300 abstracts from virology journals annotated with viral proteins and species | 0 | http://aclweb.org/anthology/W/W17/W17-2311.pdf / helencook | 2017-08-07 | Released |
| AnEM_full-texts | 250 documents selected randomly from full-text papers. Entity types: organism subdivision, anatomical system, organ, multi-tissue structure, tissue, cell, developing anatomical structure, cellular component, organism substance, immaterial anatomical entity and pathological formation. Together with AnEM_abstracts, it is probably the largest manually annotated corpus on anatomical entities. | 687 | NaCTeM / Yue Wang | 2023-11-29 | Uploading |
| AnEM_abstracts | 250 documents selected randomly from citation abstracts. Entity types: organism subdivision, anatomical system, organ, multi-tissue structure, tissue, cell, developing anatomical structure, cellular component, organism substance, immaterial anatomical entity and pathological formation. Together with AnEM_full-texts, it is probably the largest manually annotated corpus on anatomical entities. | 1.91 K | NaCTeM / Yue Wang | 2023-11-29 | Released |
| FA_Top100Plus-Disease | 2/2 FirstAuthor Top100+7 for diseases MONDO & HPO | 246 | AikoHIRAKI | 2023-11-29 | Testing |
| BioLarkPubmedHPO | 228 abstracts manually annotated with Human Phenotype Ontology (HPO) concepts and harmonized by three curators, which can be used as a reference standard for free text annotation of human phenotypes. For more info, please see Groza et al. "Automatic concept recognition using the human phenotype ontology reference and test suite corpora", 2015. | 7.16 K | Tudor Groza / simon | 2023-11-29 | Released |
| FA_Top100-Disease | 1/2 FirstAuthor Top100 (201811-201910) for diseases MONDO & HPO | 2.14 K | AikoHIRAKI | 2023-11-29 | |