
Projects

Name | Description | # Ann. | Author | Maintainer | Updated at | Status

Projects 101-120 of 316
bionlp-st-ge-2016-coref
Coreference annotation for the benchmark data sets (reference and test) of the BioNLP-ST 2016 GE task. For detailed information, refer to the benchmark reference data set (bionlp-st-ge-2016-reference) and the benchmark test data set (bionlp-st-ge-2016-test).
853 annotations | DBCLS | Jin-Dong Kim | 2023-11-28 | Released

bionlp-st-ge-2016-test
The benchmark test data set of the BioNLP-ST 2016 GE task. It contains Genia-style event annotations for 14 full-paper articles about NFκB proteins. For testing purposes, however, all annotations are blinded: users cannot see the annotations in this project. Instead, annotations in any other project can be compared to the hidden annotations here, and that comparison yields an automatic evaluation. A GE task participant can have the output of an automatic annotation system evaluated through the following process:
1. Create a new project.
2. Import documents from the project bionlp-st-ge-2016-test-proteins into your project.
3. Import annotations from the project bionlp-st-ge-2016-test-proteins into your project. At this point, comparing your project to this benchmark project will show that the protein annotations in your project are 100% correct, while other annotations, e.g. events, are at 0%.
4. Produce event annotations with your system, on top of the protein annotations.
5. Upload your event annotations to your project.
6. Compare your project to this project to get the evaluation.
The GE 2016 benchmark data set is provided as multi-layer annotations, which include:
- bionlp-st-ge-2016-reference: benchmark reference data set
- bionlp-st-ge-2016-test: benchmark test data set (this project)
- bionlp-st-ge-2016-test-proteins: protein annotation of the benchmark test data set
The following are supporting resources:
- bionlp-st-ge-2016-coref: coreference annotation
- bionlp-st-ge-2016-uniprot: protein annotation with UniProt IDs
- pmc-enju-pas: dependency parsing results produced by Enju
- UBERON-AE: annotation of anatomical entities as defined in UBERON
- ICD10: annotation of disease names as defined in ICD10
- GO-BP: annotation of biological process names as defined in GO
- GO-CC: annotation of cellular component names as defined in GO
A SPARQL-driven search interface is provided at http://bionlp.dbcls.jp/sparql.
7.99 K annotations | DBCLS | Jin-Dong Kim | 2023-11-29 | Released

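The comparison step of the process above can be sketched in a few lines. This is a hypothetical illustration, assuming PubAnnotation-style JSON documents with a `denotations` array; the function names and the simple exact span-and-label matching criterion are assumptions for illustration, not the official GE task evaluation code.

```python
# Hypothetical sketch: scoring one annotation set against a reference by
# exact span-and-label match over PubAnnotation-style denotations.

def denotation_set(doc):
    """Collect (begin, end, label) triples from a PubAnnotation-style document."""
    return {(d["span"]["begin"], d["span"]["end"], d["obj"])
            for d in doc.get("denotations", [])}

def precision_recall(predicted, reference):
    """Exact-match precision and recall of predicted denotations vs. a reference."""
    pred, ref = denotation_set(predicted), denotation_set(reference)
    tp = len(pred & ref)  # true positives: denotations present in both sets
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(ref) if ref else 0.0
    return precision, recall

# Toy example: the reference has two protein mentions, the prediction found one.
reference = {"denotations": [
    {"id": "T1", "span": {"begin": 0, "end": 4}, "obj": "Protein"},
    {"id": "T2", "span": {"begin": 10, "end": 15}, "obj": "Protein"},
]}
predicted = {"denotations": [
    {"id": "T1", "span": {"begin": 0, "end": 4}, "obj": "Protein"},
]}
p, r = precision_recall(predicted, reference)  # p == 1.0, r == 0.5
```

This mirrors the behavior described above: importing the given protein annotations unchanged yields 100% on proteins, while event annotations start at 0% until your system produces them.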
semrep-sample
Sample annotation of SemRep, produced by Rindflesch et al. Reference: Rindflesch, T.C. and Fiszman, M. (2003). The interaction of domain knowledge and linguistic structure in natural language processing: interpreting hypernymic propositions in biomedical text. Journal of Biomedical Informatics, 36(6):462-477.
11.1 K annotations | Rindflesch et al. | Jin-Dong Kim | 2023-11-29 | Testing

LitCovid-manual
540 annotations | Jin-Dong Kim | 2023-11-30 | Developing

epi-statement-test
2 annotations | Jin-Dong Kim | 2023-11-30 | Testing

DocumentLevelAnnotationSample
A sample project for document-level annotation.
47 annotations | Jin-Dong Kim | 2023-11-29 | Testing

CHEMDNER-training-test
The training subset of the CHEMDNER corpus.
29.4 K annotations | Martin Krallinger et al. | Jin-Dong Kim | 2023-11-27 | Testing

GlyCosmos-GlycanStructure-c
0 annotations | Jin-Dong Kim | 2023-11-29 | Testing

KoreanFN-example
Korean FrameNet example.
6 annotations | Jin-Dong Kim | 2023-11-29 | Developing

GO-BP
Annotation for biological processes as defined in the "Biological Process" subset of the Gene Ontology.
35.4 K annotations | DBCLS | Jin-Dong Kim | 2023-11-29 | Developing

test10
212 annotations | Jin-Dong Kim | 2023-11-24

GlyCosmos600-CLO
1.73 K annotations | Jin-Dong Kim | 2023-11-28 | Testing

example2
12 annotations | Jin-Dong Kim | 2023-11-29 | Testing

GlyCosmos600-FMA
7.12 K annotations | Jin-Dong Kim | 2023-11-29

LitCoin-entities
13.6 K annotations | Jin-Dong Kim | 2023-11-29 | Testing

bionlp-st-ge-2016-reference
The benchmark reference data set of the BioNLP-ST 2016 GE task. It contains Genia-style event annotations for 20 full-paper articles about NFκB proteins. The task is to develop an automatic annotation system that produces annotations as similar as possible to those in this data set. To have its performance evaluated, a participating system needs to produce annotations for the documents in the benchmark test data set (bionlp-st-ge-2016-test). The GE 2016 benchmark data set is provided as multi-layer annotations, which include:
- bionlp-st-ge-2016-reference: benchmark reference data set (this project)
- bionlp-st-ge-2016-test: benchmark test data set (annotations are blinded)
- bionlp-st-ge-2016-test-proteins: protein annotation of the benchmark test data set
The following are supporting resources:
- bionlp-st-ge-2016-coref: coreference annotation
- bionlp-st-ge-2016-uniprot: protein annotation with UniProt IDs
- pmc-enju-pas: dependency parsing results produced by Enju
- UBERON-AE: annotation of anatomical entities as defined in UBERON
- ICD10: annotation of disease names as defined in ICD10
- GO-BP: annotation of biological process names as defined in GO
- GO-CC: annotation of cellular component names as defined in GO
A SPARQL-driven search interface is provided at http://bionlp.dbcls.jp/sparql.
14.4 K annotations | DBCLS | Jin-Dong Kim | 2023-11-29 | Released

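The SPARQL endpoint mentioned above can be queried programmatically over the standard SPARQL 1.1 Protocol (an HTTP GET with a `query` parameter). The sketch below only builds the request without sending it; the query text is a generic illustration, since the actual graph layout at the endpoint is not described here.

```python
# Hypothetical sketch: constructing a SPARQL 1.1 Protocol request for the
# endpoint at http://bionlp.dbcls.jp/sparql. The SELECT query is a generic
# placeholder, not a query tailored to this endpoint's vocabulary.
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://bionlp.dbcls.jp/sparql"

query = """
SELECT ?s ?p ?o
WHERE { ?s ?p ?o }
LIMIT 10
"""

# SPARQL 1.1 Protocol: pass the query as a URL-encoded `query` parameter
# and ask for JSON results via the Accept header.
url = ENDPOINT + "?" + urlencode({"query": query})
req = Request(url, headers={"Accept": "application/sparql-results+json"})
# urllib.request.urlopen(req) would send the request; it is omitted here
# to keep the sketch network-free.
```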
pubmed-sentences-benchmark
A benchmark data set for segmenting text into sentences. The source of the annotation is the GENIA treebank v1.0. It was produced as follows:
1. Sentence annotations were extracted from the GENIA treebank v1.0, converted to PubAnnotation JSON, and uploaded.
2. 12 abstracts failed alignment. Of these, 4 had a dot ('.') where there should have been a colon (':'); they were manually fixed and then successfully uploaded: 7903907, 8053950, 8508358, 9415639.
3. The other 8 failed abstracts were "250-word truncation" cases. They were manually fixed and successfully uploaded; during the fixing, manual annotations were added for the missing pieces of text.
4. 30 abstracts had extra text at the end containing a copyright statement, e.g. "Copyright 1998 Academic Press." These were annotated as sentences in GTB, but the text no longer exists in PubMed, so the extra text was removed together with its sentence annotations.
18.4 K annotations | GENIA project | Jin-Dong Kim | 2023-11-28 | Released

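The conversion step above produces PubAnnotation JSON, where each sentence becomes a denotation with a character span. The sketch below shows the shape of such a document; the naive punctuation-based splitter and the "Sentence" label are illustrative assumptions — the benchmark itself takes its sentence boundaries from the GENIA treebank, not from a splitter.

```python
# Hypothetical sketch: representing sentence segmentation as PubAnnotation
# JSON (a "text" field plus "denotations" with begin/end character spans).
import json
import re

def sentences_to_pubannotation(text):
    """Split text at ., !, ? and emit each span as a Sentence denotation."""
    doc = {"text": text, "denotations": []}
    for i, m in enumerate(re.finditer(r"[^.!?]+[.!?]?", text), start=1):
        if text[m.start():m.end()].strip():  # skip empty/whitespace spans
            doc["denotations"].append({
                "id": f"T{i}",
                "span": {"begin": m.start(), "end": m.end()},
                "obj": "Sentence",
            })
    return doc

doc = sentences_to_pubannotation("NF-kB is activated. It moves to the nucleus.")
payload = json.dumps(doc)  # the JSON body that would be uploaded
```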
LitCovid-sample-docs
NCBI maintains a comprehensive literature resource on COVID-19: https://www.ncbi.nlm.nih.gov/research/coronavirus/ The LitCovid project at PubAnnotation is a collection of the titles and abstracts of the LitCovid dataset, for people who want to perform text-mining analyses. Note that if you produce annotations for the documents in this project and contribute them back to PubAnnotation, they will become publicly available together with contributions from other people. To contribute your annotations to PubAnnotation, refer to the documentation page: http://www.pubannotation.org/docs/submit-annotation/ The list of PMIDs is sourced from here. Below is a notice from the original LitCovid dataset:
PUBLIC DOMAIN NOTICE
National Center for Biotechnology Information
This software/database is a "United States Government Work" under the terms of the United States Copyright Act. It was written as part of the author's official duties as a United States Government employee and thus cannot be copyrighted. This software/database is freely available to the public for use. The National Library of Medicine and the U.S. Government have not placed any restriction on its use or reproduction. Although all reasonable efforts have been taken to ensure the accuracy and reliability of the software and data, the NLM and the U.S. Government do not and cannot warrant the performance or results that may be obtained by using this software or data. The NLM and the U.S. Government disclaim all warranties, express or implied, including warranties of performance, merchantability or fitness for any particular purpose.
Please cite the authors in any work or product based on this material: Chen Q, Allot A, & Lu Z. (2020) Keep up with the latest coronavirus research, Nature 579:193.
0 annotations | Jin-Dong Kim | 2023-11-29 | Uploading

JF-test2
0 annotations | johanf | 2020-03-26 | Testing

JF-test
A test corpus for exploring this service.
9 annotations | Johan Frid | johanf | 2023-12-03 | Testing