UseCases_PubTatorCentral | | Predictions from PubTator Central (https://www.ncbi.nlm.nih.gov/research/pubtator/) for the seven datasets and for four entity types (disease, chemical, species, cell line) | 0 | | zebet | 2023-11-29 | Developing | |
AlvisNLP-Test | | Project for testing the AlvisNLP PubAnnotation server during BLAH3. | 17 | | Bibliome | 2023-11-29 | Testing | |
Trait curation | | Project for trait curation in PGDBj | 479 | Sachiko Shirasawa | Sachiko Shirasawa | 2023-11-24 | Testing | |
LappsTest | | Project to test posting annotations directly from the Language Applications Grid | 2.67 K | Keith Suderman | ksuderman | 2023-11-27 | Developing | |
uniprot-mouse | | Protein annotation based on UniProt | 11.5 K | | Jin-Dong Kim | 2023-11-28 | Developing | |
LitCovid-docs | | Updated on 2021-01-12
A comprehensive literature resource on the subject of Covid-19 is collected by NCBI:
https://www.ncbi.nlm.nih.gov/research/coronavirus/
The LitCovid project@PubAnnotation is a collection of the titles and abstracts of the LitCovid dataset, intended for people who want to perform text mining analysis. Please note that if you annotate the documents in this project and contribute the annotations back to PubAnnotation, they will become publicly available together with contributions from other people.
If you want to contribute your annotation to PubAnnotation, please refer to the documentation page:
http://www.pubannotation.org/docs/submit-annotation/
The list of PMIDs is sourced from here
The 6 entries with the following PMIDs could not be included because they were not available from PubMed: 32161394, 32104909, 32090470, 32076224, 32188956, 32238946.
Below is a notice from the original LitCovid dataset:
PUBLIC DOMAIN NOTICE
National Center for Biotechnology Information
This software/database is a "United States Government Work" under the
terms of the United States Copyright Act. It was written as part of
the author's official duties as a United States Government employee and
thus cannot be copyrighted. This software/database is freely available
to the public for use. The National Library of Medicine and the U.S.
Government have not placed any restriction on its use or reproduction.
Although all reasonable efforts have been taken to ensure the accuracy
and reliability of the software and data, the NLM and the U.S.
Government do not and cannot warrant the performance or results that
may be obtained by using this software or data. The NLM and the U.S.
Government disclaim all warranties, express or implied, including
warranties of performance, merchantability or fitness for any particular
purpose.
Please cite the authors in any work or product based on this material:
Chen Q, Allot A, & Lu Z. (2020) Keep up with the latest coronavirus research, Nature 579:193
| 18 | | Jin-Dong Kim | 2023-11-28 | Testing | |
CORD-19_bioRxiv_medRxiv_subset | | The bioRxiv/medRxiv subset of the CORD-19 dataset: pre-prints that are not peer reviewed.
The documents in this project will be updated as the CORD-19 dataset grows.
See the COVID DATASET LICENSE AGREEMENT.
| 0 | | Jin-Dong Kim | 2023-11-29 | Released | |
CORD-19_Commercial_use_subset | | The Commercial use subset of the CORD-19 dataset.
The documents in this project will be updated as the CORD-19 dataset grows.
See the COVID DATASET LICENSE AGREEMENT. | 0 | | Jin-Dong Kim | 2023-11-29 | Released | |
CORD-19_Custom_license_subset | | The Custom license subset of the CORD-19 dataset.
The documents in this project will be updated as the CORD-19 dataset grows.
See the COVID DATASET LICENSE AGREEMENT. | 5.08 M | | Jin-Dong Kim | 2023-11-24 | Released | |
CORD-19_Non-commercial_use_subset | | The Non-commercial use subset of the CORD-19 dataset.
The documents in this project will be updated as the CORD-19 dataset grows.
See the COVID DATASET LICENSE AGREEMENT. | 0 | | Jin-Dong Kim | 2023-11-29 | Released | |
ykjeong_test | | pub_annotation_test | 276 | | | 2023-11-28 | Testing | |
LitCovid_Glycan-Motif-Structure | | PubDictionaries annotation for glycan-Motif terms. | 6.51 K | | ISSAKU YAMADA | 2023-11-29 | Beta | |
bionlp-st-ge-2016-uniprot | | UniProt protein annotation to the benchmark data set of the BioNLP-ST 2016 GE task: reference data set (bionlp-st-ge-2016-reference) and test data set (bionlp-st-ge-2016-test).
The annotations are produced based on a dictionary which is semi-automatically compiled for the 34 full paper articles included in the benchmark data set (20 in the reference data set + 14 in the test data set).
For detailed information about the BioNLP-ST 2016 GE task data sets, please refer to the benchmark reference data set (bionlp-st-ge-2016-reference) and benchmark test data set (bionlp-st-ge-2016-test).
| 16.2 K | DBCLS | Jin-Dong Kim | 2023-11-29 | Beta | |
QFMC_MEDLINE | | Quaero French Medical Corpus:
Annotation of MEDLINE titles | 5.9 K | Aurélie Névéol | Pierre Zweigenbaum | 2023-11-29 | Beta | |
tees-test | | Random PMC document used for testing during the development of a RESTful TEES parsing web service. | 3.39 K | Nico Colic | Nico Colic | 2023-11-24 | Developing | |
spacy-test | | Random set of articles used for testing in the development of the RESTful spaCy parsing web service. Since development is now finished, they are released for the community to use. | 131 K | Nico Colic | Nico Colic | 2023-11-29 | Released | |
glytoucan-iupac | | Retrying glytoucan-iupac annotation as of March 9, 2018 | 0 | | kiyoko | 2023-11-29 | Testing | |
metamap-sample | | Sample annotation of MetaMap, produced by Aronson et al.
An overview of MetaMap: historical perspective and recent advances, JAMIA 2010 | 10.9 K | Alan R Aronson | Jin-Dong Kim | 2023-11-27 | Testing | |
pubtator-sample | | Sample annotation of PubTator produced by Zhiyong Lu et al. | 28 | Zhiyong Lu | Jin-Dong Kim | 2023-11-27 | Testing | |
semrep-sample | | Sample annotation of SemRep, produced by Rindflesch, et al.
Rindflesch, T.C. and Fiszman, M. (2003). The interaction of domain knowledge and linguistic structure in natural language processing: interpreting hypernymic propositions in biomedical text. Journal of Biomedical Informatics, 36(6):462-477. | 11.1 K | Rindflesch et al. | Jin-Dong Kim | 2023-11-29 | Testing | |