The goal of this work is to identify epistemic statements in the scientific literature. An epistemic statement is a statement of unknowns, hypotheses, speculations, uncertainties, including statements of claims, hypotheses, questions, explanations, future opportunities, surprises, issues, or concerns within a sentence. The unit of an epistemic statement is a sentence automatically parsed. The classification is binary - epistemic statement or not. We will label epistemic statements only and one can assume that if a statement is not labeled, then it is not an epistemic statement.
The classifier is a CRF, trained on gold standard annotations of epistemic statements that are currently ongoing. We report an F-measure of 0.91 after 5-fold cross validation on a test set with 914 statements and an F-measure of 0.9 on a held out document with 130 statements. This project is still under development and is submitted to be used for the CovidLit project and associated Hackathon.
Please contact Mayla if you have any questions.