PubAnnotation

Yue Wang

User info

Collections

Name		Description	Updated at

1-3 / 3
AnEM		the largest manually annotated corpus on anatomical entities	2019-04-03
DisGeNET5		Associations obtained by text mining MEDLINE abstracts using the BeFree system	2019-03-11
PIR		Protein Information Resource (PIR)	2019-03-12

Projects

Name	T	Description	# Ann.	Updated at	Status

1-10 / 25
0mytest			144	2023-11-29
bionlp-st-pc-2013-training		The training dataset from the pathway curation (PC) task in the BioNLP Shared Task 2013. The entity types defined in the PC task are simple chemical, gene or gene product, complex and cellular component.	7.86 K	2023-11-27	Released
AIMed		The AIMed corpus is one of the most widely used corpora for protein-protein interaction extraction. The protein annotations are either parts of the protein interaction annotations, or are uninvolved in any protein interaction annotation. Publication: http://www.cs.utexas.edu/~ml/papers/bionlp-aimed-04.pdf	4.04 K	2023-11-27	Testing
SCAI-Test		A small corpus for the evaluation of dictionaries containing chemical entities. Publication: http://www.scai.fraunhofer.de/fileadmin/images/bio/data_mining/paper/kolarik2008.pdf Original source: https://www.scai.fraunhofer.de/en/business-research-areas/bioinformatics/downloads/corpora-for-chemical-entity-recognition.html	1.21 K	2023-11-28	Released
bionlp-st-epi-2011-training		The training dataset from the Epigenetics and Post-translational Modifications (EPI) task in the BioNLP Shared Task 2011. The core entities of the task are genes and gene products (RNA and proteins), identified in the data simply as "Protein" annotations.	7.59 K	2023-11-29	Released
DisGeNET5_variant_disease		The file contains variant-disease associations obtained by text mining MEDLINE abstracts using the BeFree system, including the variant and disease offsets.	144 K	2023-11-24	Released
OryzaGP		A dataset for named entity recognition of rice genes	29.1 K	2023-11-24	Uploading
PIR-corpus1		The Protein Information Resource (PIR) is not biased towards any particular biomedical domain, and is expected to provide more diverse protein names in a given sample size. Annotation category: protein, compound-protein, acronym.	4.44 K	2023-11-27	Released
funRiceGenes-exact			841	2023-11-28	Developing
2_test			145 M	2023-11-24

Automatic annotators

Name	Description

1-2 / 2
PTO-all
PTO-exact

Editors

none