Projects

Name	Description	# Annotations	Author	Maintainer	Updated at	Status

GlyCosmos15-Glycan		21.4 K		Jin-Dong Kim	2024-09-19	Developing
uniprot-human	UniProt proteins for human	21.8 K	Jin-Dong Kim	Jin-Dong Kim	2023-11-29	Testing
PennBioIE	The PennBioIE corpus (0.9) covers two domains of biomedical knowledge. One is the inhibition of the cytochrome P450 family of enzymes (CYP450 or CYP for short), and the other is the molecular genetics of cancer (oncology or onco for short).	23.8 K	UPenn Biomedical Information Extraction Project	Yue Wang	2023-11-26	Released
Goldhamster2_Cellosaurus		27.5 K		zebet	2023-11-29	Developing
Glycosmos15-GlycoEpitope		27.8 K		Jin-Dong Kim	2024-09-18	Developing
ASCO_abstracts	ASCO abstracts sample dataset	28 K		alo33	2023-11-29	Testing
OryzaGP	A dataset for named entity recognition of rice genes	29.1 K	Huy Do and Pierre Larmande	Yue Wang	2023-11-24	Uploading
CHEMDNER-training-test	The training subset of the CHEMDNER corpus	29.4 K	Martin Krallinger et al.	Jin-Dong Kim	2023-11-27	Testing
GlycoBiology-NCBITAXON	NCBITAXON-based annotation to GlycoBiology abstracts	32.7 K		shuo50	2023-11-29	Testing
Genomics_Informatics	Genomics & Informatics (NLM title abbreviation: Genomics Inform) is the official journal of the Korea Genome Organization. A text corpus for this journal, annotated with various levels of linguistic information, would be a valuable resource, as information extraction requires syntactic, semantic, and higher levels of natural language processing. In this study, we publish our new corpus called GNI Corpus version 1.0, extracted and annotated from full texts of Genomics & Informatics with an NLTK (Natural Language ToolKit)-based text mining script. The preliminary version of the corpus could be used as a training and testing set for a system that serves a variety of functions for future biomedical text mining.	35.3 K	Hyun-Seok Park	ewha-bio	2023-11-29	Beta
GO-BP	Annotation for biological processes as defined in the "Biological Process" subset of Gene Ontology	35.4 K	DBCLS	Jin-Dong Kim	2023-11-29	Developing
biosemtest	test submitting Peregrine annotations	35.6 K	Mark Thompson	markthompson	2023-11-29	Testing
FirstAuthor_s	UniProt IDs and Nikkaji IDs for drugs and other compounds in the new-paper reviews (新着論文レビュー) that mention disease names (Japanese)	39.3 K		AikoHIRAKI	2023-11-29	Developing
OryzaGP_2022		41.3 K		larmande	2023-11-24
genia-medco-coref	Coreference annotation made to the Genia corpus, following the MUC annotation scheme. It is a product of the collaboration between the Genia and the MedCo projects.	45.9 K	MedCo project & Genia project	Jin-Dong Kim	2023-11-24	Developing
jnlpba-st-training	The training data used in the task came from the GENIA version 3.02 corpus. This was formed from a controlled search on MEDLINE using the MeSH terms "human", "blood cells" and "transcription factors". From this search, 2,000 abstracts were selected and hand annotated according to a small taxonomy of 48 classes based on a chemical classification. Among the classes, 36 terminal classes were used to annotate the GENIA corpus. For the shared task, only the classes protein, DNA, RNA, cell line and cell type were used. The first three incorporate several subclasses from the original taxonomy, while the last two are interesting in order to make the task realistic for post-processing by a potential template filling application. The publication years of the training set range from 1990 to 1999.	51.1 K	GENIA	Yue Wang	2023-11-26	Released
Preeclampsia	A collection of titles and abstracts of "Preeclampsia"-related papers. They were extracted from PubMed using the MeSH term "Preeclampsia" and specifying the language to be "English", on 11th September 2017. The texts were then annotated by PubDictionaries using the dictionary "Preeclampsia".	58.7 K		callahan_tiff	2023-11-29	Developing
PMA_MER	PMAs annotated using MERpy.	58.9 K	Stefano Rensi	therightstef	2023-11-29	Developing
FSU-PRGE	A new broad-coverage corpus composed of 3,306 MEDLINE abstracts dealing with gene and protein mentions. The annotation process was semi-automatic. Publication: http://aclweb.org/anthology/W/W10/W10-1838.pdf	59.5 K	CALBC Project	Yue Wang	2023-11-26	Released
craft-ca-core-dev	Development data for CRAFT CA shared task, core concepts only. This project contains the development (training) annotations for the Concept Annotation task of the CRAFT Shared Task 2019. This particular set of concept annotations is the "core" set. See the task description for details, but this set contains only annotations to concepts that appear in the original 10 Open Biomedical Ontologies used for annotation. (That is to say, it does not contain any annotations to extension classes).	59.8 K	University of Colorado Anschutz Medical Campus	craft-st	2023-11-29	Released
