LitCovid-sample-PD-UBERON | | PubDictionaries annotation for UBERON terms (updated on 2020-04-30).
This is an annotation of anatomical entities based on Uberon.
The terms in Uberon have been uploaded to PubDictionaries (Uberon), and that dictionary is used to produce the annotations in this project.
The parameter configuration used for this project is linked from the project description.
Note that this is an automatically generated, dictionary-based annotation.
It will be updated periodically as more documents are added and the dictionary is improved.
| 310 | | Jin-Dong Kim | 2023-11-28 | Beta | |
sonoma2 | | sonoma2 | 9.09 K | Standigm | chanung | 2023-11-29 | Beta | |
bionlp-st-ge-2016-uniprot | | UniProt protein annotation for the benchmark data sets of the BioNLP-ST 2016 GE task: the reference data set (bionlp-st-ge-2016-reference) and the test data set (bionlp-st-ge-2016-test).
The annotations are produced from a dictionary that was semi-automatically compiled for the 34 full-paper articles included in the benchmark data sets (20 in the reference data set + 14 in the test data set).
For detailed information about the BioNLP-ST 2016 GE task data sets, please refer to the benchmark reference data set (bionlp-st-ge-2016-reference) and the benchmark test data set (bionlp-st-ge-2016-test).
| 16.2 K | DBCLS | Jin-Dong Kim | 2023-11-29 | Beta | |
TEST-ChemicalEntity | | ChemicalEntity : Annotated by PD-MeSH2022_CHEBI_tuned-B | 827 | | yucca | 2023-11-29 | Beta | |
LitCovid-sample-PD-GO-BP-0 | | | 708 | | Jin-Dong Kim | 2023-11-29 | Beta | |
LitCovid-sample-PD-NCBITaxon | | | 1.35 K | | Jin-Dong Kim | 2023-11-29 | Beta | |
LitCovid-sample-sentences | | | 2.3 K | | Jin-Dong Kim | 2023-11-29 | Beta | |
IMDB-NLP | | Annotations for chunking and semantic role labeling based on in-memory databases. | 0 | | | 2016-05-06 | Uploading | |
OryzaGP | | A dataset for named entity recognition of rice genes | 29.1 K | Huy Do and Pierre Larmande | Yue Wang | 2023-11-24 | Uploading | |
LitCovid-sample-Pubtator | | | 3.86 K | | Jin-Dong Kim | 2023-11-28 | Uploading | |
chemicals | | | 0 | | pruas_18 | 2023-11-29 | Uploading | |
pubmed-enju-pas | | Annotating PubMed abstracts for predicate-argument structure (PAS). Enju 2.4.2 is used to automatically compute PAS. | 19.1 M | Enju | Jin-Dong Kim | 2023-11-24 | Developing | |
Allie | | An annotation set of abbreviations and expanded forms extracted from PubMed/MEDLINE by machines. | 8.7 M | Database Center for Life Science | Yasunori Yamamoto | 2023-11-24 | Developing | |
sentences | | Sentence segmentation annotation.
Automatic annotation by TextSentencer. | 6.96 M | DBCLS | Jin-Dong Kim | 2023-11-24 | Developing | |
LitCovid-sentences | | | 5.63 M | | Jin-Dong Kim | 2023-11-24 | Developing | |
LitCovid-PD-CLO | | | 3.73 M | | Jin-Dong Kim | 2023-11-24 | Developing | |
LitCovid-PMC-OGER-BB | | Annotating PMC articles with OGER and BioBERT, according to a hand-crafted Covid-specific dictionary and the 10 different CRAFT ontologies (http://bionlp-corpora.sourceforge.net/CRAFT/):
Chemical Entities of Biological Interest (CHEBI),
Cell Ontology (CL),
Entrez Gene (UBERON),
Gene Ontology (biological process (GO-BP), cellular component (GO-CC), and molecular function (GO-MF)),
NCBI Taxonomy (NCBITaxon),
Protein Ontology (PR),
Sequence Ontology (SO) | 3.14 M | Fabio Rinaldi | Nico Colic | 2023-11-24 | Developing | |
PMID_GLOBAL | | Global sentencer tagging of public PMID abstracts.
Open and publicly available to the global community. | 2.24 M | | alo33 | 2023-11-24 | Developing | |
LitCovid-PD-CHEBI | | | 1.43 M | | Jin-Dong Kim | 2023-11-24 | Developing | |
Epistemic_Statements | | The goal of this work is to identify epistemic statements in the scientific literature. An epistemic statement is a statement of unknowns, hypotheses, speculations, or uncertainties, including statements of claims, questions, explanations, future opportunities, surprises, issues, or concerns within a sentence. The unit of annotation is an automatically parsed sentence. The classification is binary: a sentence either is an epistemic statement or is not. Only epistemic statements are labeled, so a statement that is not labeled can be assumed not to be an epistemic statement.
The classifier is a CRF trained on gold-standard annotations of epistemic statements; the gold-standard annotation effort is still ongoing. We report an F-measure of 0.91 after 5-fold cross-validation on a test set with 914 statements and an F-measure of 0.9 on a held-out document with 130 statements. This project is still under development and has been submitted for use in the CovidLit project and the associated Hackathon.
Please contact Mayla if you have any questions. | 1.42 M | | mboguslav | 2023-11-24 | Developing | |