CORD-19_bioRxiv_medRxiv_subset | | The bioRxiv/medRxiv subset of the CORD-19 dataset: pre-prints that are not peer reviewed.
The documents in this project will be updated as the CORD-19 dataset grows.
See the COVID DATASET LICENSE AGREEMENT.
| 0 | 2023-11-29 | Released | |
bionlp-st-ge-2016-test-proteins | | Protein annotations for the benchmark test data set of the BioNLP-ST 2016 GE task.
A participant in the GE task may import the documents and annotations of this project into their own project as a starting point for producing event annotations.
For more details, please refer to the benchmark test data set (bionlp-st-ge-2016-test).
| 4.34 K | 2023-11-27 | Released | |
LitCovid-PD-FMA-UBERON-v1 | | PubDictionaries annotation for anatomy terms - updated on 2020-04-20
Disease term annotation based on FMA and Uberon. Version 2020-04-20.
The terms in FMA and Uberon are loaded into PubDictionaries (FMA and Uberon), with which the annotations in this project are produced.
The parameter configuration used for this project is here for FMA and there for Uberon.
Note that this is an automatically generated, dictionary-based annotation. It will be updated periodically as documents are added and the dictionary is improved. | 4.3 K | 2023-11-27 | Released | |
GlyCosmos600-docs | | A random collection of 600 PubMed abstracts from 6 glycobiology-related journals: Glycobiology, Glycoconjugate journal, The Journal of biological chemistry, Journal of proteome research, Journal of proteomics, and Carbohydrate research. The full set of PMIDs for these journals was collected on June 11, 2019, and 100 PMIDs were then randomly sampled from each journal. | 0 | 2023-11-29 | Released | |
LitCovid-v1-docs | | NCBI maintains a comprehensive literature resource on COVID-19:
https://www.ncbi.nlm.nih.gov/research/coronavirus/
The LitCovid project@PubAnnotation is a collection of the titles and abstracts of the LitCovid dataset, intended for people who want to perform text mining analysis. Please note that if you produce annotations for the documents in this project and contribute them back to PubAnnotation, they will become publicly available together with contributions from other people.
If you want to contribute your annotations to PubAnnotation, please refer to the documentation page:
http://www.pubannotation.org/docs/submit-annotation/
The list of PMIDs is sourced from here.
The following 6 PMIDs could not be included because they were not available from PubMed: 32161394, 32104909, 32090470, 32076224, 32188956, 32238946.
Below is a notice from the original LitCovid dataset:
PUBLIC DOMAIN NOTICE
National Center for Biotechnology Information
This software/database is a "United States Government Work" under the
terms of the United States Copyright Act. It was written as part of
the author's official duties as a United States Government employee and
thus cannot be copyrighted. This software/database is freely available
to the public for use. The National Library of Medicine and the U.S.
Government have not placed any restriction on its use or reproduction.
Although all reasonable efforts have been taken to ensure the accuracy
and reliability of the software and data, the NLM and the U.S.
Government do not and cannot warrant the performance or results that
may be obtained by using this software or data. The NLM and the U.S.
Government disclaim all warranties, express or implied, including
warranties of performance, merchantability or fitness for any particular
purpose.
Please cite the authors in any work or product based on this material:
Chen Q, Allot A, & Lu Z. (2020) Keep up with the latest coronavirus research, Nature 579:193
| 0 | 2023-11-29 | Released | |
CORD-19_Non-commercial_use_subset | | The non-commercial use subset of the CORD-19 dataset.
The documents in this project will be updated as the CORD-19 dataset grows.
See the COVID DATASET LICENSE AGREEMENT. | 0 | 2023-11-29 | Released | |
CORD-19-PD-UBERON | | PubDictionaries annotation for UBERON terms - updated on 2020-04-30
This is a disease term annotation based on Uberon.
The terms in Uberon are loaded into PubDictionaries (Uberon), with which the annotations in this project are produced.
The parameter configuration used for this project is here.
Note that this is an automatically generated, dictionary-based annotation. It will be updated periodically as documents are added and the dictionary is improved. | 1.42 M | 2023-11-24 | Released | |
CORD-19_Custom_license_subset | | The Custom license subset of the CORD-19 dataset.
The documents in this project will be updated as the CORD-19 dataset grows.
See the COVID DATASET LICENSE AGREEMENT. | 5.08 M | 2023-11-24 | Released | |
LitCovid-sentences-v1 | | Sentence segmentation of all the texts in the LitCovid literature. The segmentation is automatically obtained using the TextSentencer annotation service developed and maintained by DBCLS. | 16.5 K | 2023-11-27 | Released | |
CORD-19_All_docs | | All the documents in the whole CORD-19 dataset.
The documents in this project will be updated as the CORD-19 dataset grows.
See the COVID DATASET LICENSE AGREEMENT. | 0 | 2023-11-29 | Released | |
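
The GlyCosmos600-docs entry above describes its collection as 100 randomly sampled PMIDs from each of 6 journals. The following is a minimal sketch of that kind of per-journal sampling, not the procedure the project authors actually used. The function name, the seed, and the candidate IDs are all hypothetical; the IDs are synthetic placeholders, not real PMIDs.

```python
import random

JOURNALS = [
    "Glycobiology",
    "Glycoconjugate journal",
    "The Journal of biological chemistry",
    "Journal of proteome research",
    "Journal of proteomics",
    "Carbohydrate research",
]


def sample_pmids_per_journal(pmids_by_journal, per_journal=100, seed=2019):
    """Draw the same number of PMIDs, without replacement, from each journal."""
    rng = random.Random(seed)  # fixed seed so the sample can be reproduced
    collection = {}
    for journal in sorted(pmids_by_journal):
        # De-duplicate and sort so the result does not depend on input order.
        candidates = sorted(set(pmids_by_journal[journal]))
        if len(candidates) < per_journal:
            raise ValueError(f"{journal}: only {len(candidates)} PMIDs available")
        collection[journal] = rng.sample(candidates, per_journal)
    return collection


# Synthetic placeholder IDs (assumption): 500 made-up candidates per journal.
pmids_by_journal = {
    journal: [str(10_000 * (i + 1) + n) for n in range(500)]
    for i, journal in enumerate(JOURNALS)
}

collection = sample_pmids_per_journal(pmids_by_journal)
print(sum(len(pmids) for pmids in collection.values()))  # 600
```

Sampling a fixed number per journal, rather than 600 from the pooled list, keeps journals with far more abstracts from dominating the collection.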