Projects

Projects 61-80 of 119

| Name | Description | # Ann. | Author | Maintainer | Updated_at | Status |
|---|---|---|---|---|---|---|
| CORD-PICO | Automatic annotation of the CORD-19 dataset with PICO categories. The corpus was automatically labeled with an LSTM-CRF model trained on human-annotated PubMed abstracts from https://github.com/bepnye/EBM-NLP. Currently, only titles and abstracts are annotated, using Population, Intervention and Outcome labels as well as more fine-grained labels such as Age, Drug, Mortality and others. | 69.6 K | Simon Suster | ssuster | 2020-04-10 | Developing |
| CORD-19-sample-IDO | | 76 | | Jin-Dong Kim | 2020-04-13 | Developing |
| LitCovid-PMC-OGER-BB | Annotating PMC articles with OGER and BioBERT, according to a hand-crafted Covid-specific dictionary and the 10 different CRAFT ontologies (http://bionlp-corpora.sourceforge.net/CRAFT/): Chemical Entities of Biological Interest (CHEBI), Cell Ontology (CL), Entrez Gene (UBERON), Gene Ontology (biological process (GO-BP), cellular component (GO-CC), and molecular function (GO-MF)), NCBI Taxonomy (NCBITaxon), Protein Ontology (PR), and Sequence Ontology (SO). | 18.1 K | Fabio Rinaldi | Nico Colic | 2020-04-15 | Developing |
| Epistemic_Statements | The goal of this work is to identify epistemic statements in the scientific literature. An epistemic statement is a statement of unknowns, hypotheses, speculations, or uncertainties, including statements of claims, hypotheses, questions, explanations, future opportunities, surprises, issues, or concerns within a sentence. The unit of an epistemic statement is an automatically parsed sentence. The classification is binary: epistemic statement or not. Only epistemic statements are labeled, so an unlabeled statement can be assumed not to be an epistemic statement. The classifier is a CRF trained on gold-standard annotations of epistemic statements; the annotation effort is still ongoing. We report an F-measure of 0.91 after 5-fold cross validation on a test set with 914 statements and an F-measure of 0.9 on a held-out document with 130 statements. This project is still under development and has been submitted for use in the CovidLit project and the associated hackathon. Please contact Mayla if you have any questions. | 1.42 M | | mboguslav | 2020-04-16 | Developing |
| CORD-19-sample-MONDO | | 113 | | Jin-Dong Kim | 2020-04-18 | Developing |
| CORD-19-sample-HP | | 39 | | Jin-Dong Kim | 2020-04-18 | Developing |
| CORD-19-sample-CHEBI | | 16 | | Jin-Dong Kim | 2020-04-19 | Developing |
| CORD-19-sample-FMA-UBERON | | 61 | | Jin-Dong Kim | 2020-04-19 | Developing |
| LitCovid-TimeML | | 426 K | | Jin-Dong Kim | 2020-04-28 | Developing |
| EVEX-test | | 104 K | Sampo Pyysalo | spyysalo | 2015-02-26 | Testing |
| EVEX-test-2 | | 103 K | Sampo Pyysalo | spyysalo | 2015-02-27 | Testing |
| EVEX-test-3 | | 66.7 K | Sampo Pyysalo | spyysalo | 2015-02-27 | Testing |
| BLAH2015_Annotations_Adderall | | 0 | nestoralvaro | nestoralvaro | 2015-03-15 | Testing |
| BioASQ-sample | Collection of PubMed articles which appear in the BioASQ sample data set. | 0 | BioASQ | Jin-Dong Kim | 2015-10-13 | Testing |
| SPECIES800_autotagged | This project comprises the SPECIES800 corpus documents automatically annotated by the Jensenlab tagger. Annotated entity types are: genes/proteins from the mentioned organisms (and any human ones); PubChem Compound identifiers; NCBI Taxonomy entries; Gene Ontology cellular component terms; BRENDA Tissue Ontology terms; Disease Ontology terms; Environment Ontology terms. The SPECIES 800 (S800) corpus comprises 800 PubMed abstracts. In its original form, species mentions were manually identified and mapped to the corresponding NCBI Taxonomy identifiers. Described in: Pafilis E, Frankild SP, Fanini L, Faulwetter S, Pavloudi C, et al. (2013). The SPECIES and ORGANISMS Resources for Fast and Accurate Identification of Taxonomic Names in Text. PLoS ONE, 8(6): e65390. doi:10.1371/journal.pone.0065390. The manually annotated corpus is also available as a PubAnnotation project. | 0 | Evangelos Pafilis, Sampo Pyysalo, Lars Juhl Jensen | evangelos | 2015-11-20 | Testing |
| prefixtest | Test of prefix functionality. | 2 | Sampo | smp | 2015-11-25 | Testing |
| biosemtest | Test submitting Peregrine annotations. | 35.6 K | Mark Thompson | markthompson | 2015-12-17 | Testing |
| pubtator-sample | Sample annotation of PubTator produced by Zhiyong Lu et al. | 28 | Zhiyong Lu | Jin-Dong Kim | 2016-01-19 | Testing |
| EVEXDB | | 11.2 M | EVEXDB | smp | 2016-01-28 | Testing |
| GlycoBiology-PACDB | cGGDB-based annotation to GlycoBiology abstracts. | 3.03 K | Toshihide Shikanai | shikanai | 2016-02-01 | Testing |