
Projects

Projects 381-400 of 590

| Name | Description | # Ann. | Author | Maintainer | Updated_at | Status |
|---|---|---|---|---|---|---|
| LitCovid-sentences | | 5.63 M | | Jin-Dong Kim | 2023-11-24 | Developing |
| CORD-19_Custom_license_subset | The Custom license subset of the CORD-19 dataset. The documents in this project will be updated as the CORD-19 dataset grows. See the COVID DATASET LICENSE AGREEMENT. | 5.08 M | | Jin-Dong Kim | 2023-11-24 | Released |
| CORD-19-PD-UBERON | PubDictionaries annotation for UBERON terms, updated at 2020-04-30. It is disease term annotation based on Uberon. The terms in Uberon are uploaded to PubDictionaries (Uberon), with which the annotations in this project are produced. The parameter configuration used for this project is here. Note that it is an automatically generated dictionary-based annotation. It will be updated periodically as documents are added and the dictionary is improved. | 1.42 M | | Jin-Dong Kim | 2023-11-24 | Released |
| test10 | | 212 | | Jin-Dong Kim | 2023-11-24 | |
| tutorial1 | | 5 | | Jin-Dong Kim | 2023-11-29 | Testing |
| PubMed-German-test | A collection of PubMed abstracts written in German. | 0 | | Jin-Dong Kim | 2023-11-24 | Developing |
| PubMed-2017 | Abstracts published in 2017. | 0 | | Jin-Dong Kim | 2023-11-24 | Developing |
| MENA-example | | 5 | | Jin-Dong Kim | 2023-11-29 | |
| speech-test | | 6 | | Jin-Dong Kim | 2023-11-26 | Testing |
| GlyCosmos600-MAT | | 863 | | Jin-Dong Kim | 2023-11-29 | Testing |
| CORD-19-SciBite-sentences | | 11.2 K | | Jin-Dong Kim | 2023-11-26 | Testing |
| LitCovid-PD-FMA-UBERON-v1 | PubDictionaries annotation for anatomy terms, updated at 2020-04-20. Disease term annotation based on FMA and Uberon. Version 2020-04-20. The terms in FMA and Uberon are loaded in PubDictionaries (FMA and Uberon), with which the annotations in this project are produced. The parameter configuration used for this project is here for FMA and there for Uberon. Note that it is an automatically generated dictionary-based annotation. It will be updated periodically as documents are added and the dictionary is improved. | 4.3 K | | Jin-Dong Kim | 2023-11-27 | Released |
| bionlp-st-ge-2016-test-proteins | Protein annotations to the benchmark test data set of the BioNLP-ST 2016 GE task. A participant of the GE task may import the documents and annotations of this project into their own project as a starting point for producing event annotations. For more details, please refer to the benchmark test data set (bionlp-st-ge-2016-test). | 4.34 K | DBCLS | Jin-Dong Kim | 2023-11-27 | Released |
| GlyCosmos600-GlycoProteins | GlycoProtein annotations were made using the glycoprotein-name dictionary on PubDictionaries: http://pubannotation.org/projects/GlyCosmos600-docs The documents were imported from the GlyCosmos600-docs project: http://pubannotation.org/projects/GlyCosmos600-docs | 3.68 K | | Jin-Dong Kim | 2023-11-27 | Testing |
| LitCoin-GeneOrGeneProduct-v0 | https://pubdictionaries.org/text_annotation.json?dictionary=NCBIGene-NER&threshold=0.85&abbreviation=true | 15.8 K | | Jin-Dong Kim | 2023-11-29 | |
| bionlp-st-ge-2016-coref | Coreference annotation to the benchmark data set (reference and test) of the BioNLP-ST 2016 GE task. For detailed information, please refer to the benchmark reference data set (bionlp-st-ge-2016-reference) and benchmark test data set (bionlp-st-ge-2016-test). | 853 | DBCLS | Jin-Dong Kim | 2024-06-17 | Released |
| LitCovid-PD-HP | | 922 K | | Jin-Dong Kim | 2023-11-28 | Beta |
| pubmed-sentences-benchmark | Benchmark data for segmenting text into sentences. The source of annotation is the GENIA treebank v1.0. The process was as follows. Sentence annotations were extracted from the GENIA treebank v1.0, converted to PubAnnotation JSON, and uploaded. 12 abstracts failed alignment. Of these, 4 had a dot ('.') character where there should be a colon (':'); they were manually fixed and then successfully uploaded: 7903907, 8053950, 8508358, 9415639. The other 8 were "250 word truncation" cases; they were manually fixed and successfully uploaded, with manual annotations added for the missing pieces of text. 30 abstracts had extra text at the end indicating a copyright statement, e.g., "Copyright 1998 Academic Press." This text was annotated as a sentence in GTB but no longer exists in PubMed, so it was removed together with its sentence annotation. | 18.4 K | GENIA project | Jin-Dong Kim | 2023-11-28 | Released |
| example-dialog | | 0 | | Jin-Dong Kim | 2023-11-27 | Testing |
| bionlp-st-ge-2016-uniprot | UniProt protein annotation to the benchmark data set of the BioNLP-ST 2016 GE task: reference data set (bionlp-st-ge-2016-reference) and test data set (bionlp-st-ge-2016-test). The annotations are produced based on a dictionary which is semi-automatically compiled for the 34 full paper articles included in the benchmark data set (20 in the reference data set + 14 in the test data set). For detailed information about the BioNLP-ST GE 2016 task data sets, please refer to the benchmark reference data set (bionlp-st-ge-2016-reference) and benchmark test data set (bionlp-st-ge-2016-test). | 16.2 K | DBCLS | Jin-Dong Kim | 2023-11-29 | Beta |
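
The LitCoin-GeneOrGeneProduct-v0 entry gives only a PubDictionaries request URL as its description. As a minimal sketch, the snippet below splits that URL into its query parameters so the annotation settings can be read at a glance. The URL is copied from the listing; the helper name `annotation_settings` is hypothetical, and the snippet only parses the string, without contacting the service or assuming anything about how the service interprets each parameter.

```python
from urllib.parse import urlsplit, parse_qs

# URL copied verbatim from the LitCoin-GeneOrGeneProduct-v0 description.
URL = ("https://pubdictionaries.org/text_annotation.json"
       "?dictionary=NCBIGene-NER&threshold=0.85&abbreviation=true")


def annotation_settings(url: str) -> dict:
    """Return the query parameters of a request URL as a flat dict.

    Hypothetical helper for illustration; values are kept as strings.
    """
    query = urlsplit(url).query
    # parse_qs maps each key to a list of values; keep the first of each.
    return {key: values[0] for key, values in parse_qs(query).items()}


settings = annotation_settings(URL)
for name, value in settings.items():
    print(f"{name} = {value}")
```

Keeping the values as strings avoids guessing at types; a caller who needs a numeric threshold can convert it explicitly.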