PA-LLM | | 🖐️ LLMs for biomedical text summarisation | 0 | | Nico Colic | 2024-01-19 | Developing | |
PMA_Manual | | Manually annotated examples of medical device PMA approval statements | 204 | Stefano Rensi | therightstef | 2023-11-27 | Developing | |
LitCoin-Disease-MeSH | | MeSH C and F03, plus false negatives that appear in two or more documents in LitCoin-entities | 3.56 K | | yucca | 2023-11-29 | Testing | |
GENIAcorpus | | multi_cell (1,782), mono_cell (222), virus (2,136), protein_family_or_group (8,002), protein_complex (2,394), protein_molecule (21,290), protein_subunit (942), protein_substructure (129), protein_domain_or_region (1,044), protein_other (97), peptide (521), amino_acid_monomer (784), DNA_family_or_group (332), DNA_molecule (664), DNA_substructure (2), DNA_domain_or_region (39), DNA_other (16), RNA_family_or_group (1,545), RNA_molecule (554), RNA_substructure (106), RNA_domain_or_region (8,237), RNA_other (48), polynucleotide (259), nucleotide (243), lipid (2,375), carbohydrate (99), other_organic_compound (4,113), body_part (461), tissue (706), cell_type (7,473), cell_component (679), cell_line (4,129), other_artificial_source (211), inorganic (258), atom (342), other (21,056) | 78.9 K | GENIA Project | Yue Wang | 2023-11-29 | Released | |
bayaba | | nalee | 7 | | Nakyolee | 2023-11-29 | | |
LitCovid-PubTatorCentral | | Named-entities for the documents in the LitCovid dataset. Annotations were automatically predicted by the PubTatorCentral tool (https://www.ncbi.nlm.nih.gov/research/pubtator/) | 4.64 K | | zebet | 2023-11-27 | Released | |
GlycoBiology-NCBITAXON | | NCBITAXON-based annotation to GlycoBiology abstracts | 32.7 K | | shuo50 | 2023-11-29 | Testing | |
Biotea | | NCBO annotation on full text for PMC articles. Currently includes only a small set of 2811 articles: those that support curated disease-protein annotations from UniProt and have machine-processable full text. | 894 K | L. Garcia | | 2023-11-24 | Developing | |
bionlp-st-ge-2016-test-tees | | NER and event extraction produced by TEES (with the default GE11 model) for the 14 full papers used in the BioNLP 2016 GE task test corpus. | 9.17 K | Nico Colic | Nico Colic | 2023-11-29 | Released | |
bionlp-st-ge-2016-reference-tees | | NER and event extraction produced by TEES (with the default GE11 model) for the 20 full papers used in the BioNLP 2016 GE task reference corpus. | 14.6 K | Nico Colic | Nico Colic | 2023-11-29 | Released | |
EDAN70 | | NLP tagging of articles concerning COVID-19. | 0 | | fettmedknaoz | 2023-11-29 | | |
tagtog | | OpenAccess annotations coming from tagtog.net | 0 | tagtog | tagtog | 2015-02-23 | Developing | |
PubCasesORDO | | ORDO annotation in PubCases | 865 K | | Toyofumi Fujiwara | 2023-11-24 | Beta | |
OryzaGP_2021_v2 | | OryzaGP_2021_v2 will use a second annotator | 208 K | | larmande | 2023-11-29 | Developing | |
OryzaGP_2020 | | OryzaGP is a dataset of PubMed abstracts related to the species Oryza sativa | 0 | Pierre Larmande | larmande | 2023-11-29 | Developing | |
Grays_part2 | | Osteology w/o 21 | 8.63 K | | okubo | 2023-11-29 | Testing | |
LitCovid-v1-docs | | A comprehensive literature resource on the subject of Covid-19 is collected by NCBI:
https://www.ncbi.nlm.nih.gov/research/coronavirus/
The LitCovid project@PubAnnotation is a collection of the titles and abstracts of the LitCovid dataset, intended for people who want to perform text mining analysis. Please note that if you annotate the documents in this project and contribute the annotations back to PubAnnotation, they will become publicly available together with contributions from other people.
If you want to contribute your annotation to PubAnnotation, please refer to the documentation page:
http://www.pubannotation.org/docs/submit-annotation/
The list of PMIDs is sourced from here
The following 6 PMIDs could not be included because they were not available from PubMed: 32161394, 32104909, 32090470, 32076224, 32188956, 32238946.
Below is a notice from the original LitCovid dataset:
PUBLIC DOMAIN NOTICE
National Center for Biotechnology Information
This software/database is a "United States Government Work" under the
terms of the United States Copyright Act. It was written as part of
the author's official duties as a United States Government employee and
thus cannot be copyrighted. This software/database is freely available
to the public for use. The National Library of Medicine and the U.S.
Government have not placed any restriction on its use or reproduction.
Although all reasonable efforts have been taken to ensure the accuracy
and reliability of the software and data, the NLM and the U.S.
Government do not and cannot warrant the performance or results that
may be obtained by using this software or data. The NLM and the U.S.
Government disclaim all warranties, express or implied, including
warranties of performance, merchantability or fitness for any particular
purpose.
Please cite the authors in any work or product based on this material :
Chen Q, Allot A, & Lu Z. (2020) Keep up with the latest coronavirus research, Nature 579:193
| 0 | | Jin-Dong Kim | 2023-11-29 | Released | |
LitCovid-sample-docs | | A comprehensive literature resource on the subject of Covid-19 is collected by NCBI:
https://www.ncbi.nlm.nih.gov/research/coronavirus/
The LitCovid project@PubAnnotation is a collection of the titles and abstracts of the LitCovid dataset, intended for people who want to perform text mining analysis. Please note that if you annotate the documents in this project and contribute the annotations back to PubAnnotation, they will become publicly available together with contributions from other people.
If you want to contribute your annotation to PubAnnotation, please refer to the documentation page:
http://www.pubannotation.org/docs/submit-annotation/
The list of PMIDs is sourced from here
Below is a notice from the original LitCovid dataset:
PUBLIC DOMAIN NOTICE
National Center for Biotechnology Information
This software/database is a "United States Government Work" under the
terms of the United States Copyright Act. It was written as part of
the author's official duties as a United States Government employee and
thus cannot be copyrighted. This software/database is freely available
to the public for use. The National Library of Medicine and the U.S.
Government have not placed any restriction on its use or reproduction.
Although all reasonable efforts have been taken to ensure the accuracy
and reliability of the software and data, the NLM and the U.S.
Government do not and cannot warrant the performance or results that
may be obtained by using this software or data. The NLM and the U.S.
Government disclaim all warranties, express or implied, including
warranties of performance, merchantability or fitness for any particular
purpose.
Please cite the authors in any work or product based on this material :
Chen Q, Allot A, & Lu Z. (2020) Keep up with the latest coronavirus research, Nature 579:193
| 0 | | Jin-Dong Kim | 2023-11-29 | Uploading | |
CORD-19_All_docs | | All the documents in the whole CORD-19 dataset.
The documents in this project will be updated as the CORD-19 dataset grows.
See the COVID DATASET LICENSE AGREEMENT. | 0 | | Jin-Dong Kim | 2023-11-29 | Released | |
UBERON-AE | | Annotation for anatomical entities based on the "Anatomical Entity" subtree of the UBERON ontology.
Annotations are automatically produced using PubDictionaries with a threshold of 0.85. | 859 K | DBCLS | Jin-Dong Kim | 2023-11-29 | Developing | |