PMC:7116472 / 5135-17637 JSON TXT

LitCovid-PubTator

Id	Subject	Object	Predicate	Lexical cue	tao:has_database_id
128	289-300	Species	denotes	Coronavirus	Tax:11118
129	90-100	Disease	denotes	Infections	MESH:D007239
133	866-874	Species	denotes	patients	Tax:9606
134	1178-1185	Species	denotes	patient	Tax:9606
135	1260-1269	Disease	denotes	infection	MESH:D007239
149	1978-1985	Gene	denotes	insulin	Gene:3630
150	2147-2155	Species	denotes	patients	Tax:9606
151	1711-1726	Disease	denotes	cardiac disease	MESH:D006331
152	1728-1755	Disease	denotes	chronic respiratory disease	MESH:D012140
153	1767-1773	Disease	denotes	asthma	MESH:D001249
154	1776-1797	Disease	denotes	chronic renal disease	MESH:D051436
155	1857-1870	Disease	denotes	liver disease	MESH:D008107
156	1872-1880	Disease	denotes	dementia	MESH:D003704
157	1942-1959	Disease	denotes	diabetes mellitus	MESH:D003920
158	1999-2002	Disease	denotes	HIV	MESH:D015658
159	2006-2010	Disease	denotes	AIDS	MESH:D000163
160	2016-2026	Disease	denotes	malignancy	MESH:D009369
161	2211-2220	Disease	denotes	infection	MESH:D007239
165	2389-2397	Species	denotes	patients	Tax:9606
166	2287-2294	Disease	denotes	obesity	MESH:D009765
167	2403-2411	Disease	denotes	covid-19	MESH:C000657245
173	2963-2971	Species	denotes	patients	Tax:9606
174	3094-3102	Species	denotes	patients	Tax:9606
175	3201-3209	Species	denotes	patients	Tax:9606
176	2869-2878	Disease	denotes	mortality	MESH:D003643
177	3010-3030	Disease	denotes	SARS-CoV-2 infection	MESH:C000657245
186	3361-3368	Species	denotes	patient	Tax:9606
187	3660-3667	Species	denotes	patient	Tax:9606
188	3843-3851	Species	denotes	patients	Tax:9606
189	3911-3919	Species	denotes	patients	Tax:9606
190	3492-3500	Disease	denotes	covid-19	MESH:C000657245
191	3721-3730	Disease	denotes	pneumonia	MESH:D011014
192	3735-3750	Disease	denotes	flulike illness	MESH:D002908
193	3857-3865	Disease	denotes	covid-19	MESH:C000657245
197	4101-4108	Species	denotes	patient	Tax:9606
198	4272-4280	Species	denotes	patients	Tax:9606
199	4286-4294	Disease	denotes	covid-19	MESH:C000657245
202	6280-6291	Species	denotes	Coronavirus	Tax:11118
203	6243-6252	Disease	denotes	Mortality	MESH:D003643
205	7086-7095	Disease	denotes	mortality	MESH:D003643
209	8371-8380	Disease	denotes	mortality	MESH:D003643
210	8545-8554	Disease	denotes	mortality	MESH:D003643
211	8639-8648	Disease	denotes	pneumonia	MESH:D011014
215	8852-8860	Species	denotes	Patients	Tax:9606
216	9126-9134	Species	denotes	patients	Tax:9606
217	9233-9241	Species	denotes	patients	Tax:9606
220	9660-9668	Disease	denotes	covid-19	MESH:C000657245
221	10071-10079	Disease	denotes	covid-19	MESH:C000657245
223	10234-10238	Species	denotes	mice	Tax:10090
229	10906-10916	Species	denotes	SARS-CoV-2	Tax:2697049
230	10920-10931	Species	denotes	coronavirus	Tax:11118
231	10858-10867	Disease	denotes	pneumonia	MESH:D011014
232	10871-10877	Disease	denotes	sepsis	MESH:D018805
233	10894-10899	Disease	denotes	COVID	MESH:C000657245
236	11455-11462	Species	denotes	patient	Tax:9606
237	11551-11559	Species	denotes	Patients	Tax:9606
240	12108-12115	Species	denotes	patient	Tax:9606
241	12210-12218	Species	denotes	patients	Tax:9606
243	12246-12253	Species	denotes	Patient	Tax:9606
245	12392-12400	Species	denotes	Patients	Tax:9606
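
The rows above can be read programmatically. The following is a minimal sketch, assuming the columns are tab-separated in the order given by the header (Id, Subject, Object, Predicate, Lexical cue, database id) and that the Subject column holds a "begin-end" character span. The names Denotation and parse_row are hypothetical and are not part of any PubAnnotation tooling.

```python
from typing import NamedTuple


class Denotation(NamedTuple):
    """One annotation row (hypothetical container for illustration)."""
    ann_id: str
    begin: int
    end: int
    obj: str
    cue: str
    db_id: str


def parse_row(line: str) -> Denotation:
    # Assumed column order, taken from the table header:
    # Id, Subject (span), Object, Predicate, Lexical cue, optional database id.
    fields = line.rstrip("\n").split("\t")
    ann_id, span, obj, _predicate, cue = fields[:5]
    # Tables without a database-id column get an empty string.
    db_id = fields[5] if len(fields) > 5 else ""
    begin, end = (int(part) for part in span.split("-"))
    return Denotation(ann_id, begin, end, obj, cue, db_id)


row = parse_row("128\t289-300\tSpecies\tdenotes\tCoronavirus\tTax:11118")
# For this row, end minus begin equals the length of the lexical cue.
assert row.end - row.begin == len(row.cue)
```

In the rows shown here the span end appears to be exclusive, so end minus begin gives the cue length; that convention is inferred from the data and should be checked before relying on it.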

LitCovid-sentences

Id	Subject	Object	Predicate	Lexical cue
T36	0-7	Sentence	denotes	Methods
T37	9-33	Sentence	denotes	Study design and setting
T38	34-245	Sentence	denotes	The International Severe Acute Respiratory and emerging Infections Consortium (ISARIC) World Health Organization (WHO) Clinical Characterisation Protocol UK (CCP-UK) study is an ongoing prospective cohort study.
T39	246-515	Sentence	denotes	The study is being performed by the ISARIC Coronavirus Clinical Characterisation Consortium (ISARIC-4C) in 260 hospitals across England, Scotland, and Wales (National Institute for Health Research Clinical Research Network Central Portfolio Management System ID 14152).
T40	516-819	Sentence	denotes	The protocol and further study details are available online.8 Model development and reporting followed the TRIPOD (transparent reporting of a multivariable prediction model for individual prediction or diagnosis) guidelines.9 The study is being conducted according to a predefined protocol (appendix 1).
T41	821-833	Sentence	denotes	Participants
T42	834-1103	Sentence	denotes	The study recruited consecutive patients aged 18 years and older with a completed index admission to one of 260 hospitals in England, Scotland, or Wales.8 Reverse transcriptase polymerase chain reaction was the only mode of testing available during the period of study.
T43	1104-1215	Sentence	denotes	The decision to test was at the discretion of the clinician attending the patient, and not defined by protocol.
T44	1216-1385	Sentence	denotes	The enrolment criterion “high likelihood of infection” reflected that a preparedness protocol cannot assume a diagnostic test will be available for an emergent pathogen.
T45	1386-1478	Sentence	denotes	In this activation, site training emphasised the importance of only recruiting proven cases.
T46	1480-1495	Sentence	denotes	Data collection
T47	1496-1592	Sentence	denotes	Demographic, clinical, and outcome data were collected by using a prespecified case report form.
T48	1593-2027	Sentence	denotes	Comorbidities were defined according to a modified Charlson comorbidity index.10 Comorbidities collected were chronic cardiac disease, chronic respiratory disease (excluding asthma), chronic renal disease (estimated glomerular filtration rate ≤30), mild to severe liver disease, dementia, chronic neurological conditions, connective tissue disease, diabetes mellitus (diet, tablet, or insulin controlled), HIV or AIDS, and malignancy.
T49	2028-2268	Sentence	denotes	These conditions were selected a priori by a global consortium to provide rapid, coordinated clinical investigation of patients presenting with any severe or potentially severe acute infection of public interest and enabled standardisation.
T50	2269-2676	Sentence	denotes	Clinician defined obesity was also included as a comorbidity owing to its probable association with adverse outcomes in patients with covid-19.1112 The clinical information used to calculate prognostic scores was taken from the day of admission to hospital.13 A practical approach was taken to sample size requirements.14 We used all available data to maximise the power and generalisability of our results.
T51	2677-2822	Sentence	denotes	Model reliability was assessed by using a temporally distinct validation cohort with geographical subsetting, together with sensitivity analyses.
T52	2824-2832	Sentence	denotes	Outcomes
T53	2833-2879	Sentence	denotes	The primary outcome was in-hospital mortality.
T54	2880-3048	Sentence	denotes	This outcome was selected because of the importance of the early identification of patients likely to develop severe illness from SARS-CoV-2 infection (a rule in test).
T55	3049-3247	Sentence	denotes	We chose to restrict analysis of outcomes to patients who were admitted more than four weeks before final data extraction (29 June 2020) to enable most patients to complete their hospital admission.
T56	3249-3280	Sentence	denotes	Independent predictor variables
T57	3281-3949	Sentence	denotes	A reduced set of potential predictor variables was selected a priori, including patient demographic information, common clinical investigations, and parameters consistently identified as clinically important in covid-19 cohorts following the methods described by Wynants and colleagues (appendix 2).5 Candidate predictor variables were selected based on three common criteria15: patient and clinical variables known to influence outcome in pneumonia and flulike illness; clinical biomarkers previously identified within the literature as potential predictors in patients with covid-19; values available for at least two thirds of patients within the derivation cohort.
T58	3950-4209	Sentence	denotes	Because our overall aim was to develop an easy-to-use risk stratification score, we made the decision to include an overall comorbidity count for each patient within model development giving each comorbidity equal weight, rather than individual comorbidities.
T59	4210-4370	Sentence	denotes	Recent evidence suggests an additive effect of comorbidity in patients with covid-19, with increasing number of comorbidities associated with poorer outcomes.16
T60	4372-4389	Sentence	denotes	Model development
T61	4390-4557	Sentence	denotes	Missing values for potential candidate variables were handled by using multiple imputation with chained equations, under the missing at random assumption (appendix 6).
T62	4558-4689	Sentence	denotes	Ten sets, each with 10 iterations, were imputed using available explanatory variables for both cohorts (derivation and validation).
T63	4690-4796	Sentence	denotes	The outcome variable was included as a predictor in the derivation dataset but not the validation dataset.
T64	4797-4913	Sentence	denotes	All model derivation and validation was performed in imputed datasets, with Rubin’s rules17 used to combine results.
T65	4914-4980	Sentence	denotes	Models were trained by using all available data up to 20 May 2020.
T66	4981-5127	Sentence	denotes	The primary intention was to create a pragmatic model for bedside use not requiring complex equations, online calculators, or mobile applications.
T67	5128-5233	Sentence	denotes	An a priori decision was therefore made to categorise continuous variables in the final prognostic score.
T68	5234-5287	Sentence	denotes	We used a three stage model building process (fig 1).
T69	5288-5476	Sentence	denotes	Firstly, generalised additive models were built incorporating continuous smoothed predictors (penalised thin plate splines) in combination with categorical predictors as linear components.
T70	5477-5661	Sentence	denotes	A criterion based approach to variable selection was taken based on the deviance explained, the unbiased risk estimator, and the area under the receiver operating characteristic curve.
T71	5662-5843	Sentence	denotes	Secondly, we visually inspected plots of component smoothed continuous predictors for linearity, and selected optimal cut-off values by using the methods of Barrio and colleagues.18
T72	5844-5981	Sentence	denotes	Lastly, final models using categorised variables were specified with least absolute shrinkage and selection operator logistic regression.
T73	5982-6135	Sentence	denotes	L1 penalised coefficients were derived using 10-fold cross validation to select the value of lambda (minimised cross validated sum of squared residuals).
T74	6136-6330	Sentence	denotes	We converted shrunk coefficients to a prognostic index with appropriate scaling to create the pragmatic 4C Mortality Score (where 4C stands for Coronavirus Clinical Characterisation Consortium).
T75	6331-6420	Sentence	denotes	We used machine learning approaches in parallel for comparison of predictive performance.
T76	6421-6595	Sentence	denotes	Given issues with interpretability, this was intended to provide a best-in-class comparison of predictive performance when accounting for any complex underlying interactions.
T77	6596-6649	Sentence	denotes	Gradient boosting decision trees were used (XGBoost).
T78	6650-6776	Sentence	denotes	All candidate predictor variables identified were included within the model, except for those with high missing values (>33%).
T79	6777-6908	Sentence	denotes	We retained individual major comorbidity variables within the model to determine whether inclusion improved predictive performance.
T80	6909-6998	Sentence	denotes	An 80%/20% random split of the derivation dataset was used to define train and test sets.
T81	6999-7075	Sentence	denotes	The validation datasets were held back and not used in the training process.
T82	7076-7364	Sentence	denotes	We used a mortality label and design matrix of centred or standardised continuous and categorical variables including all candidate variables to train gradient boosted trees minimising the binary classification error rate (defined as number of wrong cases divided by number of all cases).
T83	7365-7532	Sentence	denotes	Hyperparameters were tuned, including the learning rate and maximum tree depth, to maximise the area under the receiver operating characteristic curve in the test set.
T84	7533-7731	Sentence	denotes	This approach affords flexibility in the handling of missing data; therefore, two models were trained and optimised, one using imputed data and the other modelling missingness in complete case data.
T85	7732-7951	Sentence	denotes	We assessed discrimination for all models by using the area under the receiver operating characteristic curve in the derivation cohort, with 95% confidence intervals calculated by bootstrapped resampling (2000 samples).
T86	7952-8167	Sentence	denotes	A value of 0.5 indicates no predictive ability, 0.8 is considered good, and 1.0 is perfect.19 We assessed overall goodness of fit with the Brier score,20 a measure to quantify how close predictions are to the truth.
T87	8168-8259	Sentence	denotes	The score ranges between 0 and 1, where smaller values indicate superior model performance.
T88	8260-8440	Sentence	denotes	We plotted model calibration curves to examine agreement between predicted and observed risk across deciles of mortality risk to determine the presence of over or under prediction.
T89	8441-8680	Sentence	denotes	Risk cut-off values were defined by the total point score for an individual, which represented low (<2% mortality rate), intermediate (2-14.9%), or high risk (≥15%) groups, similar to commonly used pneumonia risk stratification scores.2122
T90	8681-8743	Sentence	denotes	We performed sensitivity analyses by using complete case data.
T91	8744-8833	Sentence	denotes	Model discrimination was also checked in ethnic groups and by sex using imputed datasets.
T92	8835-8851	Sentence	denotes	Model validation
T93	8852-8974	Sentence	denotes	Patients entered into the ISARIC WHO CCP-UK study after 20 May 2020 were included in a separate validation cohort (fig 1).
T94	8975-9080	Sentence	denotes	We determined discrimination, calibration, and performance across a range of clinically relevant metrics.
T95	9081-9220	Sentence	denotes	To avoid bias in the assessment of outcomes, patients who were admitted within four weeks of data extraction on 29 June 2020 were excluded.
T96	9221-9314	Sentence	denotes	We included patients without an outcome after four weeks and considered to have had no event.
T97	9315-9428	Sentence	denotes	A sensitivity analysis was also performed, with stratification of the validation cohort by geographical location.
T98	9429-9798	Sentence	denotes	We selected this geographical categorisation based on well described economic and health inequalities between the north and south of the United Kingdom.2324 Recent analysis has shown the impact of deprivation on risk of dying with covid-19.25 As a result, population differences between regions could change the discriminatory performance of risk stratification scores.
T99	9799-10091	Sentence	denotes	Two geographical cohorts were created, based on north-south geographical locations across the UK as defined by Hacking and colleagues.23 We performed a further sensitivity analysis to determine model performance in ethnic minority groups given the reported differences in covid-19 outcomes.26
T100	10092-10188	Sentence	denotes	All tests were two tailed and P values less than 0.05 were considered statistically significant.
T101	10189-10330	Sentence	denotes	We used R (version 3.6.3) with the finalfit, mice, glmnet, pROC, recipes, xgboost, rmda, and tidyverse packages for all statistical analysis.
T102	10332-10383	Sentence	denotes	Comparison with existing risk stratification scores
T103	10384-10493	Sentence	denotes	All derived models in the derivation dataset were compared within the validation cohort with existing scores.
T104	10494-10686	Sentence	denotes	We assessed model performance by using the area under the receiver operating characteristic curve statistic, sensitivity, specificity, positive predictive value, and negative predictive value.
T105	10687-10831	Sentence	denotes	Existing risk stratification scores were identified through a systematic literature search of Embase, WHO Medicus, and Google Scholar databases.
T106	10832-11016	Sentence	denotes	We used the search terms “pneumonia,” “sepsis,” “influenza,” “COVID-19,” “SARS-CoV-2,” “coronavirus” combined with “score” and “prognosis.” We applied no language or date restrictions.
T107	11017-11062	Sentence	denotes	The last search was performed on 1 July 2020.
T108	11063-11197	Sentence	denotes	Risk stratification tools were included whose variables were available within the database and had accessible methods for calculation.
T109	11198-11404	Sentence	denotes	We calculated performance characteristics according to original publications, and selected score cutoff values for adverse outcomes based on the most commonly used criteria identified within the literature.
T110	11405-11550	Sentence	denotes	Cut-off values were the score value for which the patient was considered at low or high risk of adverse outcome, as defined by the study authors.
T111	11551-11640	Sentence	denotes	Patients with one or more missing input variables were omitted for that particular score.
T112	11641-11802	Sentence	denotes	We also performed a decision curve analysis.27 Briefly, assessment of the adequacy of clinical prediction models can be extended by determining clinical utility.
T113	11803-12014	Sentence	denotes	By using decision curve analysis, we can make a clinical judgment about the relative value of benefits (treating a true positive) and harms (treating a false positive) associated with a clinical prediction tool.
T114	12015-12244	Sentence	denotes	The standardised net benefit was plotted against the threshold probability for considering a patient high risk for age alone and for the best discriminating models applicable to more than 50% of patients in the validation cohort.
T115	12246-12276	Sentence	denotes	Patient and public involvement
T116	12277-12391	Sentence	denotes	This was an urgent public health research study in response to a Public Health Emergency of International Concern.
T117	12392-12502	Sentence	denotes	Patients or the public were not involved in the design, conduct, or reporting of this rapid response research.
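
Sentences T86 to T89 describe the Brier score and the three risk groups. The following is a rough sketch of both ideas, not the study's own code (the authors report using R). The function names are hypothetical, and treating the quoted bands "<2%", "2-14.9%", and "≥15%" as the continuous intervals below 0.02, from 0.02 to below 0.15, and 0.15 or above is an assumption.

```python
from typing import Sequence


def brier_score(predicted: Sequence[float], observed: Sequence[int]) -> float:
    """Mean squared difference between predicted risk and the 0/1 outcome.

    Ranges from 0 to 1; smaller values indicate better performance.
    """
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum((p - y) ** 2 for p, y in zip(predicted, observed))
    return total / len(predicted)


def risk_group(mortality_rate: float) -> str:
    """Map a mortality rate (as a proportion) to the bands quoted in T89."""
    if mortality_rate < 0.02:
        return "low"
    if mortality_rate < 0.15:
        return "intermediate"
    return "high"


# Perfect predictions give a Brier score of 0.
assert brier_score([0.0, 1.0], [0, 1]) == 0.0
```

In the study the bands are attached to total point scores rather than computed per patient, so risk_group only illustrates the thresholds themselves.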

LitCovid-PD-HP

Id	Subject	Object	Predicate	Lexical cue	hp_id
T5	1767-1773	Phenotype	denotes	asthma	http://purl.obolibrary.org/obo/HP_0002099
T6	1857-1870	Phenotype	denotes	liver disease	http://purl.obolibrary.org/obo/HP_0001392
T7	1872-1880	Phenotype	denotes	dementia	http://purl.obolibrary.org/obo/HP_0000726
T8	1942-1959	Phenotype	denotes	diabetes mellitus	http://purl.obolibrary.org/obo/HP_0000819
T9	2287-2294	Phenotype	denotes	obesity	http://purl.obolibrary.org/obo/HP_0001513
T10	3721-3730	Phenotype	denotes	pneumonia	http://purl.obolibrary.org/obo/HP_0002090
T11	8639-8648	Phenotype	denotes	pneumonia	http://purl.obolibrary.org/obo/HP_0002090
T12	10858-10867	Phenotype	denotes	pneumonia	http://purl.obolibrary.org/obo/HP_0002090
T13	10871-10877	Phenotype	denotes	sepsis	http://purl.obolibrary.org/obo/HP_0100806
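
Several spans in this table also appear in the LitCovid-PubTator table (for example 1767-1773, asthma). A minimal sketch of pairing the two tracks by identical span follows. The function name shared_spans is hypothetical, the identifiers are copied from the rows above with the HP URLs shortened to their final path segment, and matching on exact span strings is an assumption that ignores partial overlaps.

```python
from typing import List, Tuple

Pair = Tuple[str, str]  # (span, identifier)


def shared_spans(track_a: List[Pair], track_b: List[Pair]) -> List[Tuple[str, str, str]]:
    """Return (span, id_from_a, id_from_b) for spans present in both tracks."""
    # If a span repeats in track_b, the last identifier wins.
    by_span = dict(track_b)
    return [(span, ident, by_span[span]) for span, ident in track_a if span in by_span]


pubtator = [
    ("1711-1726", "MESH:D006331"),  # cardiac disease
    ("1767-1773", "MESH:D001249"),  # asthma
    ("1857-1870", "MESH:D008107"),  # liver disease
]
pd_hp = [
    ("1767-1773", "HP_0002099"),  # asthma
    ("1857-1870", "HP_0001392"),  # liver disease
]

pairs = shared_spans(pubtator, pd_hp)
```

With these inputs, pairs holds the asthma and liver disease spans; cardiac disease is dropped because it has no PD-HP row.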
