PubAnnotation


Jin-Dong Kim

Collections

Name		Description	Updated at

CORD-19		CORD-19 (COVID-19 Open Research Dataset) is a free, open resource for the global research community provided by the Allen Institute for AI: https://pages.semanticscholar.org/coronavirus-research. As of 2020-03-20, it contains over 29,000 full text articles. This CORD-19 collection at PubAnnotation is prepared for the purpose of collecting annotations to the texts, so that they can be easily accessed and utilized. If you want to contribute your annotations, take the documents in the CORD-19_All_docs project, annotate the texts using your annotation system, and contribute the annotations back to PubAnnotation (HowTo). All the contributed annotations will become publicly available. Note that when uploading your annotation data you do not need to worry about slight changes in the text: PubAnnotation will automatically detect them and adjust the positions accordingly. Once you have uploaded your annotations, please notify admin@pubannotation.org so that they can be included in this collection, which will make them much easier to find. Note that as the CORD-19 dataset grows, the documents in this collection will also be updated. IMPORTANT: The CORD-19 license agreement requires that the dataset be used for text and data mining only.	2020-04-14
Glycosmos6		This collection contains annotation projects that target all the PubMed abstracts (as of January 14, 2022) from 6 glycobiology-related journals: Glycobiology, Glycoconjugate journal, The Journal of biological chemistry, Journal of proteome research, Journal of proteomics, and Carbohydrate research.	2023-11-16
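
The CORD-19 contribution workflow above comes down to producing span-based annotations against a document text. The following is a minimal Python sketch of one such annotation record. It assumes the `text` / `denotations` / `span` / `obj` layout of the PubAnnotation JSON format; the sample sentence, the labels, and the `make_denotation` helper are hypothetical and only serve as illustration.

```python
import json


def make_denotation(ident, text, phrase, obj):
    """Build one denotation by locating the first occurrence of `phrase` in `text`.

    Hypothetical helper: a real annotation system would supply its own offsets.
    """
    begin = text.find(phrase)
    if begin < 0:
        raise ValueError(f"{phrase!r} not found in text")
    return {
        "id": ident,
        "span": {"begin": begin, "end": begin + len(phrase)},  # end is exclusive
        "obj": obj,
    }


# Hypothetical sample text and labels.
text = "SARS-CoV-2 causes COVID-19."
annotation = {
    "text": text,
    "denotations": [
        make_denotation("T1", text, "SARS-CoV-2", "Species"),
        make_denotation("T2", text, "COVID-19", "Disease"),
    ],
}

print(json.dumps(annotation, indent=2))
```

Keeping the text alongside the character offsets is what allows a server to re-align spans when the stored text differs slightly from the annotated one.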

Projects

Name	T	Description	# Ann.	Updated at	Status

bionlp-st-ge-2016-test-proteins		Protein annotations to the benchmark test data set of the BioNLP-ST 2016 GE task. A participant in the GE task may import the documents and annotations of this project into their own project as a starting point for producing event annotations. For more details, please refer to the benchmark test data set (bionlp-st-ge-2016-test).	4.34 K	2023-11-27	Released
LitCovid-sentences-v1		Sentence segmentation of all the texts in the LitCovid literature. The segmentation is automatically obtained using the TextSentencer annotation service developed and maintained by DBCLS.	16.5 K	2023-11-27	Released
CORD-19-PD-UBERON		PubDictionaries annotation for UBERON terms, updated on 2020-04-30. It is an anatomical term annotation based on Uberon. The terms in Uberon are uploaded to PubDictionaries (Uberon), with which the annotations in this project are produced. The parameter configuration used for this project is here. Note that it is an automatically generated dictionary-based annotation. It will be updated periodically as the number of documents grows and the dictionary is improved.	1.42 M	2023-11-24	Released
LitCovid-PD-MONDO-v1		PubDictionaries annotation for disease terms, updated on 2020-04-20. It is based on MONDO version 2020-04-20. The terms in MONDO are loaded into PubDictionaries, with which the annotations in this project are produced. The parameter configuration used for this project is here. Note that it is an automatically generated dictionary-based annotation. It will be updated periodically as the number of documents grows and the dictionary is improved.	13.4 K	2023-11-29	Released
bionlp-st-ge-2016-test		The benchmark test data set of the BioNLP-ST 2016 GE task. It includes Genia-style event annotations to 14 full paper articles about NFκB proteins. For testing purposes, however, the annotations are all blinded, which means users cannot see the annotations in this project. Instead, the annotations in any other project can be compared to the hidden annotations in this project, and they will be automatically evaluated based on the comparison. A participant in the GE task can get an evaluation of their automatic annotation results through the following process: (1) Create a new project. (2) Import documents from the project bionlp-st-ge-2016-test-proteins to your project. (3) Import annotations from the project bionlp-st-ge-2016-test-proteins to your project. At this point, you may want to compare your project to this project, the benchmark data set. It will show that the protein annotations in your project are 100% correct, but other annotations, e.g., events, are 0%. (4) Produce event annotations, using your system, upon the protein annotations. (5) Upload your event annotations to your project. (6) Compare your project to this project to get the evaluation. The GE 2016 benchmark data set is provided as multi-layer annotations, which include: bionlp-st-ge-2016-reference (benchmark reference data set); bionlp-st-ge-2016-test (benchmark test data set, this project); bionlp-st-ge-2016-test-proteins (protein annotation to the benchmark test data set). The following are supporting resources: bionlp-st-ge-2016-coref (coreference annotation); bionlp-st-ge-2016-uniprot (protein annotation with UniProt IDs); pmc-enju-pas (dependency parsing result produced by Enju); UBERON-AE (annotation for anatomical entities as defined in UBERON); ICD10 (annotation for disease names as defined in ICD10); GO-BP (annotation for biological process names as defined in GO); GO-CC (annotation for cellular component names as defined in GO). A SPARQL-driven search interface is provided at http://bionlp.dbcls.jp/sparql.	7.99 K	2023-11-29	Released
bionlp-st-ge-2016-coref		Coreference annotation to the benchmark data set (reference and test) of BioNLP-ST 2016 GE task. For detailed information, please refer to the benchmark reference data set (bionlp-st-ge-2016-reference) and benchmark test data set (bionlp-st-ge-2016-test).	853	2023-11-28	Released
CORD-19_Custom_license_subset		The Custom license subset of the CORD-19 dataset. The documents in this project will be updated as the CORD-19 dataset grows. See the COVID DATASET LICENSE AGREEMENT.	5.08 M	2023-11-24	Released
CORD-19_Non-commercial_use_subset		The Non commercial use subset of the CORD-19 dataset. The documents in this project will be updated as the CORD-19 dataset grows. See the COVID DATASET LICENSE AGREEMENT.	0	2023-11-29	Released
CORD-19_bioRxiv_medRxiv_subset		The bioRxiv/medRxiv subset of the CORD-19 dataset: pre-prints that are not peer reviewed. The documents in this project will be updated as the CORD-19 dataset grows. See the COVID DATASET LICENSE AGREEMENT.	0	2023-11-29	Released
GlyCosmos600-docs		A random collection of 600 PubMed abstracts from 6 glycobiology-related journals: Glycobiology, Glycoconjugate journal, The Journal of biological chemistry, Journal of proteome research, Journal of proteomics, and Carbohydrate research. The full set of PMIDs was collected on June 11, 2019. From each journal, 100 PMIDs were randomly sampled.	0	2023-11-29	Released
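
The comparison step in the bionlp-st-ge-2016-test description (imported protein annotations score 100%, missing event annotations score 0%) can be illustrated with a small Python sketch. It assumes exact matching on span and label and treats events as plain labeled spans, which is a simplification; the evaluator PubAnnotation actually uses is not described here and may differ. The function name `score_by_label` and the sample data are hypothetical.

```python
from collections import defaultdict


def as_key(denotation):
    """Reduce a denotation to a hashable (begin, end, label) triple."""
    span = denotation["span"]
    return (span["begin"], span["end"], denotation["obj"])


def score_by_label(predicted, reference):
    """Exact-match precision, recall and F1 per label (illustrative only)."""
    pred_by, ref_by = defaultdict(set), defaultdict(set)
    for d in predicted:
        pred_by[d["obj"]].add(as_key(d))
    for d in reference:
        ref_by[d["obj"]].add(as_key(d))

    scores = {}
    for label in set(pred_by) | set(ref_by):
        tp = len(pred_by[label] & ref_by[label])
        p = tp / len(pred_by[label]) if pred_by[label] else 0.0
        r = tp / len(ref_by[label]) if ref_by[label] else 0.0
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        scores[label] = {"precision": p, "recall": r, "f1": f1}
    return scores


# Hypothetical data: the reference holds proteins and one event-like span,
# while the participant's project so far holds only the imported proteins.
reference = [
    {"span": {"begin": 0, "end": 5}, "obj": "Protein"},
    {"span": {"begin": 20, "end": 25}, "obj": "Protein"},
    {"span": {"begin": 10, "end": 18}, "obj": "Positive_regulation"},
]
predicted = [
    {"span": {"begin": 0, "end": 5}, "obj": "Protein"},
    {"span": {"begin": 20, "end": 25}, "obj": "Protein"},
]

scores = score_by_label(predicted, reference)
print(scores["Protein"])
print(scores["Positive_regulation"])
```

Scoring per label rather than over all annotations at once is what makes the intermediate state visible: proteins are already perfect while the event layer is still empty.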

Automatic annotators

Name	Description

PubTator-Chemical	To pull the pre-computed chemical annotation from PubTator.
PubTator-Gene	To pull the pre-computed gene annotation from PubTator.
PubTator-Species	To pull the pre-computed Species annotation from PubTator.
TextSentencer	sentence segmentation
PubTator-Disease	To pull the pre-computed disease annotation from PubTator.
PubTator-Mutation	To pull the pre-computed mutation annotation from PubTator.
discourse-simplifier	A discourse analyzer developed by Univ. Manchester.
PD-NGLY1-deficiency-B	A batch annotator for NGLY1 deficiency
PD-UBERON-AE	Annotates anatomical entities based on the UBERON-AE dictionary on PubDictionaries. The threshold is set to 0.85.
PD-MONDO	PubDictionaries annotation with the MONDO dictionary.
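
To show what a threshold such as the 0.85 mentioned for PD-UBERON-AE does in a dictionary-based annotator, here is a minimal Python sketch. The similarity measure used by PubDictionaries is not stated here, so the sketch substitutes `difflib.SequenceMatcher` from the Python standard library as a stand-in. The dictionary entries, the `EX:` identifiers, and the `lookup` function are hypothetical.

```python
from difflib import SequenceMatcher


def lookup(term, dictionary, threshold=0.85):
    """Return dictionary entries whose label is similar enough to `term`.

    Similarity is SequenceMatcher.ratio(), an assumed stand-in measure.
    Results are sorted from most to least similar.
    """
    hits = []
    for label, ident in dictionary.items():
        score = SequenceMatcher(None, term.lower(), label.lower()).ratio()
        if score >= threshold:
            hits.append((label, ident, round(score, 3)))
    return sorted(hits, key=lambda hit: -hit[2])


# Hypothetical dictionary with placeholder identifiers.
dictionary = {"lung": "EX:1", "liver": "EX:2", "kidney": "EX:3"}

print(lookup("lungs", dictionary))   # near match, passes the threshold
print(lookup("heart", dictionary))   # nothing similar enough
```

Raising the threshold trades recall for precision: fewer spelling variants are caught, but fewer unrelated terms are matched by accident.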

Editors

Name	Description

TextAE-old	TextAE version 4, which was the latest stable version until Apr. 19, 2020.
TextAE	TextAE version 5, which enables editing of attributes of denotations.