Projects

Showing 1-20 of 316 projects.

| Name | Description | Annotations | Author | Maintainer | Updated at | Status |
|---|---|---|---|---|---|---|
| LitCovid-PD-MONDO | | 2.26 M | | Jin-Dong Kim | 2023-11-24 | |
| 0_colil | | 781 K | | Yue Wang | 2023-11-24 | |
| pqqtest_sentence | | 565 K | | yaoxinzhi | 2023-11-29 | Testing |
| LitCovid-PD-UBERON | | 540 K | | Jin-Dong Kim | 2023-11-29 | |
| Oryza-OGER | | 462 K | | fabiorinaldi | 2023-11-29 | |
| 21k_plant_trait_mention | | 333 K | | xzyao | 2023-11-29 | Testing |
| GoldHamster | | 285 K | | zebet | 2023-11-29 | Beta |
| OryzaGP_2021_v2 | OryzaGP_2021_v2 will use a second annotator. | 208 K | | larmande | 2023-11-29 | Developing |
| PubMed_Structured_Abstracts | Sections (zones) as retrieved from PubMed. | 131 K | | zebet | 2023-11-28 | Released |
| c_corpus | Documents included in the c_corpus: https://github.com/SMAFIRA/c_corpus/blob/master/SMAFIRAc_0.4_Annotations.csv | 107 K | | | 2023-11-29 | Released |
| craft-ca-core-ex-dev | Development data for CRAFT CA shared task, core concepts + EXTENSIONS. This project contains the development (training) annotations for the Concept Annotation task of the CRAFT Shared Task 2019. This particular set of concept annotations is the "core+extensions" set. See the task description for details, but this set contains annotations to concepts that appear in the original 10 Open Biomedical Ontologies used for annotation PLUS annotations to extension classes created using the core concepts. | 90.2 K | University of Colorado Anschutz Medical Campus | craft-st | 2023-11-29 | Released |
| GENIAcorpus | multi_cell (1,782); mono_cell (222); virus (2,136); protein_family_or_group (8,002); protein_complex (2,394); protein_molecule (21,290); protein_subunit (942); protein_substructure (129); protein_domain_or_region (1,044); protein_other (97); peptide (521); amino_acid_monomer (784); DNA_family_or_group (332); DNA_molecule (664); DNA_substructure (2); DNA_domain_or_region (39); DNA_other (16); RNA_family_or_group (1,545); RNA_molecule (554); RNA_substructure (106); RNA_domain_or_region (8,237); RNA_other (48); polynucleotide (259); nucleotide (243); lipid (2,375); carbohydrate (99); other_organic_compound (4,113); body_part (461); tissue (706); cell_type (7,473); cell_component (679); cell_line (4,129); other_artificial_source (211); inorganic (258); atom (342); other (21,056) | 78.9 K | GENIA Project | Yue Wang | 2023-11-29 | Released |
| craft-ca-core-dev | Development data for CRAFT CA shared task, core concepts only. This project contains the development (training) annotations for the Concept Annotation task of the CRAFT Shared Task 2019. This particular set of concept annotations is the "core" set. See the task description for details, but this set contains only annotations to concepts that appear in the original 10 Open Biomedical Ontologies used for annotation. (That is to say, it does not contain any annotations to extension classes.) | 59.8 K | University of Colorado Anschutz Medical Campus | craft-st | 2023-11-29 | Released |
| PMA_MER | PMAs annotated using MERpy. | 58.9 K | Stefano Rensi | therightstef | 2023-11-29 | Developing |
| jnlpba-st-training | The training data used in the task came from the GENIA version 3.02 corpus. This was formed from a controlled search on MEDLINE using the MeSH terms "human", "blood cells" and "transcription factors". From this search, 1,999 abstracts were selected and hand annotated according to a small taxonomy of 48 classes based on a chemical classification. Among the classes, 36 terminal classes were used to annotate the GENIA corpus. For the shared task only the classes protein, DNA, RNA, cell line and cell type were used. The first three incorporate several subclasses from the original taxonomy, while the last two are interesting in order to make the task realistic for post-processing by a potential template filling application. The publication years of the training set range from 1990 to 1999. | 51.1 K | GENIA | Yue Wang | 2023-11-26 | Released |
| genia-medco-coref | Coreference annotation made to the Genia corpus, following the MUC annotation scheme. It is a product of the collaboration between the Genia and the MedCo projects. | 45.9 K | MedCo project & Genia project | Jin-Dong Kim | 2023-11-24 | Developing |
| FirstAuthor_s | UniProt IDs and Nikkaji (Japan Chemical Substance Dictionary) IDs of drugs and other compounds in reviews of newly published papers that mention disease names (Japanese) | 39.3 K | | AikoHIRAKI | 2023-11-29 | Developing |
| GO-BP | Annotation for biological processes as defined in the "Biological Process" subset of Gene Ontology | 35.4 K | DBCLS | Jin-Dong Kim | 2023-11-29 | Developing |
| Genomics_Informatics | Genomics & Informatics (NLM title abbreviation: Genomics Inform) is the official journal of the Korea Genome Organization. A text corpus for this journal annotated with various levels of linguistic information would be a valuable resource, as the process of information extraction requires syntactic, semantic, and higher levels of natural language processing. In this study, we publish our new corpus called GNI Corpus version 1.0, extracted and annotated from full texts of Genomics & Informatics, with an NLTK (Natural Language ToolKit)-based text mining script. The preliminary version of the corpus could be used as a training and testing set of a system that serves a variety of functions for future biomedical text mining. | 35.3 K | Hyun-Seok Park | ewha-bio | 2023-11-29 | Beta |
| GlycoBiology-NCBITAXON | NCBITAXON-based annotation to GlycoBiology abstracts | 32.7 K | | shuo50 | 2023-11-29 | Testing |