PubAnnotation

LASIGE: Annotating a multilingual COVID-19-related corpus for BLAH7

Collection info

The overall motivation is to create parallel multilingual datasets for text mining systems applied to COVID-19-related literature. Tracking the most recent advances in COVID-19-related research is essential given the novelty of the disease and its impact on society. However, the pace of publication requires automatic approaches to access and organize the knowledge that is produced every day. Text mining pipelines are needed to assist in that task, and developing them is only possible with evaluation datasets. Yet COVID-19-related datasets are scarce, particularly for languages other than English. The expected contribution of the project is the annotation of a multilingual parallel dataset (EN-PT), made available to the community to improve text mining research on COVID-19-related literature.

Maintainer	dpavot

Projects

Name	Description	# Ann.	Maintainer	Updated_at	RDFized_at	Status

ENG_NER_NEL	Annotations in COVID-19-related PubMed abstracts from the following ontologies: Disease Ontology ("do"), Gene Ontology ("go"), Human Phenotype Ontology ("hpo"), ChEBI ontology ("chebi"), and MeSH	493	pruas_18	2021-01-20	-	Developing
ENG_NER_NEL_CONSENSUS		607	dpavot	2021-02-15	-	Developing
ENG_RE	Entity and relation annotations from the following ontologies: Disease Ontology ('DO'), Gene Ontology ('GO'), Human Phenotype Ontology ('HPO'), and ChEBI ontology ('CHEBI').	224	dpavot	2021-01-20	-	Developing
ENG_RE_CONSENSUS		250	dpavot	2021-02-15	-
PT_NER_NEL	Annotations in Portuguese COVID-19-related abstracts from the MeSH terminology	245	pruas_18	2021-01-20	-	Developing
PT_NER_NEL_CONSENSUS		354	dpavot	2021-02-15	-