Projects

521-540 / 556

Name | Description | # Ann. | Author | Maintainer | Updated_at | Status
Epistemic_Statements | The goal of this work is to identify epistemic statements in the scientific literature. An epistemic statement is a statement of unknowns, hypotheses, speculations, or uncertainties, including statements of claims, hypotheses, questions, explanations, future opportunities, surprises, issues, or concerns within a sentence. The unit of an epistemic statement is an automatically parsed sentence. The classification is binary: epistemic statement or not. Only epistemic statements are labeled, so a statement that is not labeled can be assumed not to be an epistemic statement. The classifier is a CRF trained on gold standard annotations of epistemic statements; the annotation effort is ongoing. We report an F-measure of 0.91 after 5-fold cross validation on a test set with 914 statements and an F-measure of 0.9 on a held out document with 130 statements. This project is still under development and is submitted to be used for the CovidLit project and associated Hackathon. Please contact Mayla if you have any questions. | 1.42 M |  | mboguslav | 2023-11-24 | Developing
LocText | The manually annotated corpus consists of 100 PubMed abstracts annotated for proteins, subcellular localizations, organisms and relations between them. The focus of the corpus is on annotation of proteins and their subcellular localizations. | 2.29 K | Goldberg et al | Shrikant Vinchurkar | 2023-11-29 | Released
NCBIDiseaseCorpus | The NCBI disease corpus is fully annotated at the mention and concept level to serve as a research resource for the biomedical natural language processing community. | 6.85 K | Rezarta Islamaj Doğan, Robert Leaman, Zhiyong Lu | Chih-Hsuan Wei | 2023-11-29 | Released
PennBioIE | The PennBioIE corpus (0.9) covers two domains of biomedical knowledge. One is the inhibition of the cytochrome P450 family of enzymes (CYP450 or CYP for short), and the other is the molecular genetics of cancer (oncology or onco for short). | 23.8 K | UPenn Biomedical Information Extraction Project | Yue Wang | 2023-11-26 | Released
PIR-corpus1 | The Protein Information Resource (PIR) is not biased towards any particular biomedical domain, and is expected to provide more diverse protein names in a given sample size. Annotation category: protein, compound-protein, acronym. | 4.44 K | University of Delaware and Georgetown University Medical Center | Yue Wang | 2023-11-27 | Released
PIR-corpus2 | The protein tag was used to tag proteins, or protein-associated or -related objects, such as domains, pathways, expression of gene. Annotation guideline: http://pir.georgetown.edu/pirwww/about/doc/manietal.pdf | 5.52 K | University of Delaware and Georgetown University Medical Center | Yue Wang | 2023-11-29 | Released
GlycoConjugate-collection | The PubMed entries (titles and abstracts) from the journal of GlycoConjugate | 0 |  | Jin-Dong Kim | 2023-11-28 | Developing
bionlp-st-cg-2013-training | The training dataset from the cancer genetics task in the BioNLP Shared Task 2013. Composed of anatomical and molecular entities. | 10.9 K | NaCTeM | Yue Wang | 2023-11-28 | Released
bionlp-st-epi-2011-training | The training dataset from the Epigenetics and Post-translational Modifications (EPI) task in the BioNLP Shared Task 2011. The core entities of the task are genes and gene products (RNA and proteins), identified in the data simply as "Protein" annotations. | 7.59 K | GENIA | Yue Wang | 2023-11-29 | Released
bionlp-st-id-2011-training | The training dataset from the infectious diseases (ID) task in the BioNLP Shared Task 2011. Entity types: Genes and gene products: gene, RNA, and protein name mentions; Two-component systems: mentions of the names of two-component regulatory systems, frequently embedding the names of the two Proteins forming the system; Chemicals: mentions of chemical compounds such as "NaCL"; Organisms: mentions of organism names or organism specification through specific properties (e.g. "graRS mutant"); Regulons/Operons: mentions of names of specific regulons and operons. | 5.61 K | University of Tokyo Tsujii Laboratory, NaCTeM and Biocomplexity Institute of Virginia Tech | Yue Wang | 2023-11-28 | Released
bionlp-st-pc-2013-training | The training dataset from the pathway curation (PC) task in the BioNLP Shared Task 2013. The entity types defined in the PC task are simple chemical, gene or gene product, complex and cellular component. | 7.86 K | NaCTeM and KISTI | Yue Wang | 2023-11-27 | Released
bionlp-st-gro-2013-training | The training data set of the BioNLP-ST 2013 GRO task, including 150 MEDLINE abstracts that are annotated with concepts and relations of the Gene Regulation Ontology (GRO; http://www.ebi.ac.uk/Rebholz-srv/GRO/GRO.html) | 8.02 K | Jung-jae Kim | Jung-jae Kim | 2023-11-29 | Testing
jnlpba-st-training | The training data used in the task came from the GENIA version 3.02 corpus. This was formed from a controlled search on MEDLINE using the MeSH terms "human", "blood cells" and "transcription factors". From this search, 2,000 abstracts were selected and hand annotated according to a small taxonomy of 48 classes based on a chemical classification. Among the classes, 36 terminal classes were used to annotate the GENIA corpus. For the shared task only the classes protein, DNA, RNA, cell line and cell type were used. The first three incorporate several subclasses from the original taxonomy, while the last two are interesting in order to make the task realistic for post-processing by a potential template filling application. The publication years of the training set range from 1990 to 1999. | 51.1 K | GENIA | Yue Wang | 2023-11-26 | Released
CHEMDNER-training-test | The training subset of the CHEMDNER corpus | 29.4 K | Martin Krallinger et al. | Jin-Dong Kim | 2023-11-27 | Testing
NEUROSES | This corpus is composed of PubMed articles containing mentions of cognitive enhancers and antidepressant drugs. The selected sentences are automatically annotated using the NCBO Annotator with the Chemical Entities of Biological Interest (CHEBI) and Phenotypic Quality Ontology (PATO) ontologies. We also produced annotations using the PhenoMiner ontology via a dictionary-based tagger. | 2.14 M |  | nestoralvaro | 2023-11-24 | Beta
Ab3P-abbreviations | This corpus was developed during the creation of the Ab3P abbreviation definition identification tool. It includes 1250 manually annotated MEDLINE records. This gold standard includes 1221 abbreviation-definition pairs. Reference: Sunghwan Sohn, Donald C Comeau, Won Kim and W John Wilbur. Abbreviation definition identification based on automatic precision estimates. BMC Bioinformatics 2008, 9:402. DOI: 10.1186/1471-2105-9-402 | 2.33 K | Sunghwan Sohn, Donald C Comeau, Won Kim and W John Wilbur | comeau | 2023-11-29 | Beta
medical_relation | This is about medical inner relation. | 0 | ruleryang | ruleryang | 2023-11-29 | Testing
Zoonoses | This is the main data set of the Zoonoses project, used by PanZoora. | 10.3 K |  | AikoHIRAKI | 2023-11-26 | Developing
Zoonoses_partialAnnotation | This is a part of the Zoonoses project, used by PanZoora. The Zoonoses project provides the complete manually annotated data, whereas this project contains only a partial set. | 266 |  | AikoHIRAKI | 2023-11-27 | Released
KYMEKA20240117Test | This is a project to express the linking of terms and ontologies (DOID, FMA, Radlex) used in the dataset 'Annotationdata_type-2' used in BLAH8_Radiological Causal Annotation. | 0 | Kyung-Min Chae | Kyung-Min Chae | 2024-01-17 | Testing