PT_NER_NEL | | Annotations of Portuguese COVID-19-related abstracts with MeSH terminology | 245 | LASIGE-DeST | pruas_18 | 2023-11-29 | Developing | |
bionlp-st-ge-2016-uniprot | | UniProt protein annotations for the benchmark data sets of the BioNLP-ST 2016 GE task: the reference data set (bionlp-st-ge-2016-reference) and the test data set (bionlp-st-ge-2016-test).
The annotations are produced from a dictionary that was semi-automatically compiled for the 34 full-paper articles included in the benchmark data sets (20 in the reference data set + 14 in the test data set).
For detailed information about the BioNLP-ST 2016 GE task data sets, please refer to the benchmark reference data set (bionlp-st-ge-2016-reference) and the benchmark test data set (bionlp-st-ge-2016-test).
| 16.2 K | DBCLS | Jin-Dong Kim | 2023-11-29 | Beta | |
bionlp-st-gro-2013-training | | The training data set of the BioNLP-ST 2013 GRO task, including 150 MEDLINE abstracts that are annotated with concepts and relations of the Gene Regulation Ontology (GRO; http://www.ebi.ac.uk/Rebholz-srv/GRO/GRO.html) | 8.02 K | Jung-jae Kim | Jung-jae Kim | 2023-11-29 | Testing | |
blah6_medical_device | | BLAH6 hackathon project to annotate medical device indications in premarket approval statement summaries. The documents in this project serve as a corpus of premarket approval (PMA) statements that have undergone quality control. In particular, we have (1) removed non-ASCII characters, (2) fixed some text segmentation errors, and (3) fixed some capitalization errors. | 0 | Stefano Rensi | therightstef | 2023-11-29 | Beta | |
c_corpus | | Documents included in the c_corpus: https://github.com/SMAFIRA/c_corpus/blob/master/SMAFIRAc_0.4_Annotations.csv | 107 K | | | 2023-11-29 | Released | |
bionlp-st-gro-2013-development | | The development data set of the BioNLP-ST 2013 GRO task, including 50 MEDLINE abstracts that are annotated with concepts and relations of the Gene Regulation Ontology (GRO; http://www.ebi.ac.uk/Rebholz-srv/GRO/GRO.html) | 2.66 K | Jung-jae Kim | Jung-jae Kim | 2023-11-29 | Testing | |
craft-ca-core-ex-dev | | Development data for CRAFT CA shared task, core concepts + EXTENSIONS. This project contains the development (training) annotations for the Concept Annotation task of the CRAFT Shared Task 2019. This particular set of concept annotations is the "core+extensions" set. See the task description for details, but this set contains annotations to concepts that appear in the original 10 Open Biomedical Ontologies used for annotation PLUS annotations to extension classes created using the core concepts. | 90.2 K | University of Colorado Anschutz Medical Campus | craft-st | 2023-11-29 | Released | |
LitCoin-Chemical-MeSH-CHEBI | | ChemicalEntity:
Annotated by PD-MeSH2022_CHEBI_tuned-B | 3.84 K | | yucca | 2023-11-29 | Testing | |
disease_ontology_term_microbe | | | 5 | | evangelos | 2023-11-29 | Developing | |
bionlp-st-bb3-2016-training | | Entity (bacteria, habitats and geographical places) annotations for the training dataset of the BioNLP-ST 2016 BB task.
For more information, please refer to bionlp-st-bb3-2016-development and bionlp-st-bb3-2016-test.
Bacteria
Bacteria entities are annotated as contiguous spans of text that contain a full, unambiguous prokaryote taxon name; the type label is Bacteria. The Bacteria type is a taxon at any taxonomic level from phylum (Eubacteria) to strain. Each text entity is assigned to the most specific and unique category of the NCBI taxonomy resource. If a given strain or group of strains is not referenced by NCBI, it is assigned the closest taxid in the taxonomy.
Habitat
Habitat entities are annotated as spans of text that contain a complete mention of a potential habitat for bacteria; the type label is Habitat. Habitat entities are assigned one or several concepts from the habitat subpart of the OntoBiotope ontology. The assigned concepts are as specific as possible. OntoBiotope defines the most relevant microorganism habitats from all areas considered by microbial ecology (hosts, natural environment, anthropized environments, food, medical, etc.). Habitat entities are rarely referential entities; they are usually noun phrases including properties and modifiers. In rare cases, habitats are referred to with adjectives or verbs. The spans are generally contiguous, but some are discontinuous in order to cope with conjunctions.
Geographical
Geographical entities are geographical and organization places denoted by official names. | 1.28 K | INRA | Yue Wang | 2023-11-29 | Released | |
excludesZoonoses | | | 25 | | AikoHIRAKI | 2023-11-29 | Developing | |
ichiharatest_150825 | | test | 0 | ichihara_hisako | Hisako Ichihara | 2023-11-29 | Testing | |
ichiharatest_150830_1 | | test | 99 | | Hisako Ichihara | 2023-11-29 | Testing | |
korean_corpus_sample | | | 53 | | donghwan kim | 2023-11-29 | Testing | |
pqqtest_sentence | | | 565 K | | yaoxinzhi | 2023-11-29 | Testing | |
LitCovid-sample-HP | | | 0 | | Jin-Dong Kim | 2023-11-29 | Testing | |
GlycoBiology-Motifs | | | 4.15 K | | Jin-Dong Kim | 2023-11-29 | | |
CORD-19-PD-HP | | PubDictionaries annotation for HP terms, updated on 2020-04-30
Disease term annotation based on HP.
Version 2020-04-20.
The terms in HP are loaded in PubDictionaries, with which the annotations in this project are produced. The parameter configuration used for this project is here.
Note that this is an automatically generated, dictionary-based annotation. It will be updated periodically as more documents are added and the dictionary is improved. | 1.15 M | | Jin-Dong Kim | 2023-11-29 | Released | |
ggdb-test | | | 2.4 K | | Jin-Dong Kim | 2023-11-29 | Testing | |
LitCovid_AGAC | | | 904 | | xiajingbo | 2023-11-29 | | |