PMC:7160614 / 2591-30532

Id Subject Object Predicate Lexical cue
T27 0-12 Sentence denotes Introduction
T28 13-211 Sentence denotes On January 30, 2020, the World Health Organization (WHO) declared the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) outbreak a global health emergency of international concern.
T29 212-306 Sentence denotes This outbreak has infected all provinces of China and rapidly spread to the rest of the world.
T30 307-430 Sentence denotes At the time of writing this article (March 16, 2020), more than 158 countries and territories have been affected [1].
T31 431-759 Sentence denotes Whole-genome sequencing and phylogenetic analysis reveal that the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is similar to some beta coronaviruses detected in bats, but it is distinct from severe acute respiratory syndrome coronavirus (SARS-CoV) and Middle East respiratory syndrome coronavirus (MERS-CoV) [2].
T32 760-888 Sentence denotes Patients with COVID-19 develop pneumonia with associated symptoms of fever (98%), cough (76%), and myalgia or fatigue (44%) [3].
T33 889-987 Sentence denotes CT imaging plays a critical role in the diagnosis and the monitoring of disease progression [4–6].
T34 988-1215 Sentence denotes The latest research studies described the characteristic imaging manifestations of COVID-19, including ground-glass opacities (GGO) (57 to 88%), bilateral involvement (76 to 88%), and peripheral distribution (33 to 85%) [7–10].
T35 1216-1352 Sentence denotes Other imaging features such as consolidation, cavitation, and interlobular septal thickening are also reported in some patients [11–13].
T36 1353-1473 Sentence denotes However, these imaging manifestations of COVID-19 are nonspecific and are difficult to distinguish from those of other types of pneumonia.
T37 1474-1630 Sentence denotes To our knowledge, there have been no studies explicitly comparing imaging and clinical characteristics between pneumonia patients with and without COVID-19.
T38 1631-1828 Sentence denotes The current diagnostic criterion for COVID-19 is the positive result of a nucleic acid test by real-time reverse transcription polymerase chain reaction (RT-PCR) or next-generation sequencing [14].
T39 1829-1996 Sentence denotes However, false-negative results caused by unstable specimen processing are relatively common in clinical practice, which has worsened the spread of the outbreak [15–18].
T40 1997-2111 Sentence denotes Moreover, laboratory testing for SARS-CoV-2 requires a rigorous platform, which is not available in all hospitals.
T41 2112-2186 Sentence denotes Specimens must therefore be transferred, which may delay diagnosis for days.
T42 2187-2330 Sentence denotes Early and accurate diagnosis is crucial, particularly for critically ill patients who need emergency surgery and for those with pneumonia complications.
T43 2331-2504 Sentence denotes To solve these problems, we hypothesize that a diagnostic model can be developed based on CT imaging and clinical manifestations alone, independent of the nucleic acid test.
T44 2505-2630 Sentence denotes In this study, we identify the differences in imaging and clinical manifestations between patients with and without COVID-19.
T45 2631-2744 Sentence denotes We also develop and validate a model for COVID-19 diagnosis based on radiological semantic and clinical features.
T46 2746-2766 Sentence denotes Patients and methods
T47 2768-2776 Sentence denotes Patients
T48 2777-2928 Sentence denotes Ethical approvals by the institutional review boards were obtained for this retrospective analysis, and the need to obtain informed consent was waived.
T49 2929-3246 Sentence denotes From January 1 to February 8, 2020, seventy consecutive patients with COVID-19 admitted in 5 independent hospitals from 4 cities were enrolled in this study (mean age, 42.9 years; range, 16–69 years), including 41 men (mean age, 41.8 years; range, 16–69 years) and 29 women (mean age, 44.5 years; range, 16–66 years).
T50 3247-3352 Sentence denotes All patients were confirmed with SARS-CoV-2 infection by real-time RT-PCR and next-generation sequencing.
T51 3353-3477 Sentence denotes Of these patients, 24 were from Huizhou City, 25 from Shantou City, 15 from Yongzhou City, and the remaining 6 from Meizhou City.
T52 3478-3764 Sentence denotes During the same period, another 66 pneumonia patients without COVID-19 from Meizhou People’s Hospital were recruited as controls (mean age, 46.7 years; range, 0.3–93 years), including 43 men (mean age, 46.0 years; range, 0.3–93 years) and 23 women (mean age, 48.0 years; range, 1–86 years).
T53 3765-3837 Sentence denotes All the controls were confirmed with consecutive negative RT-PCR assays.
T54 3838-3987 Sentence denotes Figure E1 in the Supplementary Material shows the patient recruitment pathway for the control group, along with the inclusion and exclusion criteria.
T55 3988-4125 Sentence denotes In previous studies [19–21], whose sample sizes are comparable with ours, the ratio between the primary and validation cohorts is 7:3.
T56 4126-4246 Sentence denotes In this study, a total of 136 patients were divided into primary (n = 98) and validation (n = 38) cohorts, close to 7:3.
T57 4247-4528 Sentence denotes A total of 19 COVID-19 patients from two hospitals (6 patients from Meizhou People’s Hospital and 13 patients from the First Affiliated Hospital of Shantou University Medical College) and 19 randomly selected controls from Meizhou City were incorporated into the validation cohort.
T58 4529-4702 Sentence denotes The rest of the patients were incorporated into the primary cohort, including 51 COVID-19 patients from Huizhou, Yongzhou, and Shantou cities and 47 controls from Meizhou City.
T59 4703-4893 Sentence denotes The primary cohort was utilized to select the most valuable features and build the predictive model, and the validation cohort was used to evaluate and validate the performance of the model.
T60 4895-4929 Sentence denotes Image and clinical data collection
T61 4930-5230 Sentence denotes The chest CT imaging data without contrast material enhancement were obtained from multiple hospitals with different CT systems, including GE CT Discovery 750 HD (General Electric Company), SCENARIA 64 CT (Hitachi Medical), Philips Ingenuity CT (PHILIPS), and Siemens SOMATOM Definition AS (Siemens).
T62 5231-5305 Sentence denotes All images were reconstructed into 1-mm slices with a slice gap of 0.8 mm.
T63 5306-5395 Sentence denotes Detailed acquisition parameters are summarized in the Supplementary Material (Table E1).
T64 5396-5490 Sentence denotes The clinical history, nursing records, and laboratory findings were reviewed for all patients.
T65 5491-5683 Sentence denotes Clinical characteristics, including demographic information, daily body temperature, blood pressure, heart rate, clinical symptoms, and history of exposure to epidemic centers, were collected.
T66 5684-5908 Sentence denotes Total white blood cell (WBC) counts, lymphocyte counts, ratio of lymphocyte, neutrophil count, ratio of neutrophil, procalcitonin (PCT), C-reactive protein level (CRP), and erythrocyte sedimentation rate (ESR) were measured.
T67 5909-6024 Sentence denotes All threshold values chosen for laboratory metrics were based on the normal ranges set by each individual hospital.
T68 6026-6040 Sentence denotes Image analysis
T69 6041-6230 Sentence denotes For extraction of radiological semantic features, two senior radiologists (D.L. and X.C., more than 15 years of experience) reached a consensus, blinded to clinical and laboratory findings.
T70 6231-6326 Sentence denotes The radiological semantic features included both qualitative and quantitative imaging features.
T71 6327-6475 Sentence denotes The lesions in the outer third of the lung were defined as peripheral, and lesions in the inner two-thirds of the lung were defined as central [22].
T72 6476-6706 Sentence denotes The progression of COVID-19 lesions within each lung lobe was evaluated by scoring each lobe from 0 to 4 [7], corresponding to normal, 1~25% infection, 26~50% infection, 51~75% infection, and more than 75% infection, respectively.
T73 6707-6797 Sentence denotes The scores were combined for all five lobes to provide a total score ranging from 0 to 20.
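The per-lobe scoring described above can be illustrated with a minimal Python sketch. The function names and the treatment of bin edges (for example, that exactly 25% falls in score 1) are assumptions made for illustration, not details taken from the study.

```python
def lobe_score(percent_involved: float) -> int:
    """Map the percentage of one lobe that is involved to a 0-4 score.

    Assumed bins: 0 -> 0, (0, 25] -> 1, (25, 50] -> 2, (50, 75] -> 3, above 75 -> 4.
    """
    if not 0 <= percent_involved <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent_involved == 0:
        return 0
    if percent_involved <= 25:
        return 1
    if percent_involved <= 50:
        return 2
    if percent_involved <= 75:
        return 3
    return 4


def total_lung_score(lobe_percentages) -> int:
    """Sum the scores of the five lobes, giving a total between 0 and 20."""
    if len(lobe_percentages) != 5:
        raise ValueError("expected one percentage for each of the five lobes")
    return sum(lobe_score(p) for p in lobe_percentages)


# Hypothetical patient: only one lobe involved, at about 10%.
example_total = total_lung_score([10, 0, 0, 0, 0])
```

With these assumed bins, a patient with a single lobe involved at less than a quarter of its volume receives a total score of 1.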
T74 6798-6903 Sentence denotes A total of 41 radiological features (26 quantitative and 15 qualitative) were extracted for the analysis.
T75 6904-7007 Sentence denotes The descriptions of radiological semantic features are listed in the Supplementary Material (Table E2).
T76 7008-7064 Sentence denotes Figure 1 is one example of the evaluation of CT imaging.
T77 7065-7146 Sentence denotes Fig. 1 A 23-year-old female with a travel history to Wuhan presenting with fever.
T78 7147-7273 Sentence denotes Axial noncontrast CT image shows a consolidation with ground-glass opacities in the peripheral region of the right upper lobe.
T79 7274-7309 Sentence denotes An air bronchogram is found in the lesion.
T80 7310-7351 Sentence denotes The maximum diameter of the lesion is 2.8 cm.
T81 7352-7437 Sentence denotes The right upper lobe score is 1 because less than 1/4 of the lung parenchyma is involved
T82 7439-7482 Sentence denotes Clinical and radiological feature selection
T83 7483-7685 Sentence denotes To obtain the most valuable clinical and radiological semantic features, statistical analysis, univariate analysis, and the least absolute shrinkage and selection operator (LASSO) method were performed.
T84 7686-7884 Sentence denotes In statistical analysis, the chi-square test, the Kruskal-Wallis H test, and t test were utilized to compare the radiological semantic and clinical features between COVID-19 and non-COVID-19 groups.
T85 7885-7943 Sentence denotes The features with p value smaller than 0.05 were selected.
T86 7944-8072 Sentence denotes Then, univariate analysis was performed for clinical and radiological candidate features to determine the COVID-19 risk factors.
T87 8073-8159 Sentence denotes The features with p value smaller than 0.05 in univariate analysis were also selected.
T88 8160-8383 Sentence denotes The least absolute shrinkage and selection operator (LASSO) method [23] was utilized to select the most useful features with penalty parameter tuning that was conducted by 10-fold cross-validation based on minimum criteria.
T89 8384-8487 Sentence denotes Diagnostic models were then constructed by multivariate logistic regression with the selected features.
T90 8488-8606 Sentence denotes The flowchart of the feature selection process for these models is presented in the Supplementary Material (Fig. E2).
T91 8608-8658 Sentence denotes Development and validation of the diagnostic model
T92 8659-8954 Sentence denotes To develop an optimal model, we evaluated 3 models by analyzing (i) the clinical features model (C model), (ii) radiological semantic features model (R model), and (iii) the combination of clinical and radiological semantic features model (CR model) by multivariate logistic regression analysis.
T93 8955-9084 Sentence denotes The classification performances of the models were evaluated by the area under the receiver operating characteristic (ROC) curve.
T94 9085-9177 Sentence denotes The area under the curve (AUC), accuracy, sensitivity, and specificity were also calculated.
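As an illustration of the AUC mentioned above, the following minimal Python sketch computes it as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case (ties count as one half). This is the standard rank-based definition, not necessarily the routine used in the study; the function name and the toy data are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve from binary labels (1 = positive) and scores."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties contribute one half
    return wins / (len(positives) * len(negatives))


# Toy data: one positive case is ranked below one negative case.
toy_auc = roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1])
```

The pairwise loop is quadratic in the number of cases, which is acceptable for cohorts of the size reported here.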
T95 9178-9379 Sentence denotes A decision curve analysis was conducted to determine the clinical usefulness of the diagnostic model by quantifying the net benefits at different threshold probabilities in the validation dataset [24].
T96 9380-9459 Sentence denotes The development of the decision curve is described in the Supplementary Materials.
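The net benefit plotted in a decision curve is commonly computed as the true-positive rate minus the false-positive rate weighted by the odds of the threshold probability. The Python sketch below follows that common formulation; the study's own computation is described in its Supplementary Materials and may differ, and the function names and toy data are hypothetical.

```python
def net_benefit(labels, probabilities, threshold):
    """Net benefit of treating every case whose predicted probability >= threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    true_pos = sum(1 for y, p in zip(labels, probabilities) if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in zip(labels, probabilities) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - (false_pos / n) * threshold / (1 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Reference strategy: classify every case as positive."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy data: four cases, evaluated at a threshold probability of 0.5.
toy_labels = [1, 1, 0, 0]
toy_probs = [0.9, 0.6, 0.7, 0.2]
toy_nb = net_benefit(toy_labels, toy_probs, 0.5)
```

Evaluating `net_benefit` over a grid of thresholds for each model, alongside the treat-all and treat-none (net benefit 0) references, yields the curves of a decision curve plot.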
T97 9460-9541 Sentence denotes Figure 2 depicts the flowchart of the proposed analysis pipeline described above.
T98 9542-9746 Sentence denotes We also built a nomogram, a quantitative tool to predict the individual probability of infection by COVID-19, based on the multivariate logistic analysis of the CR model with the primary cohort.
T99 9747-9903 Sentence denotes Depending on the coefficient of the predictive factors in multivariate logistic regression model, all values of each predictive factor were assigned points.
T100 9904-9983 Sentence denotes A total point was obtained by summing all the points of each predictive factor.
T101 9984-10094 Sentence denotes The scale also showed the relationship between the total point and the prediction probability in the nomogram.
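The point assignment described above can be sketched in Python under one common nomogram convention: the predictor whose contribution to the linear predictor spans the widest range is given 0 to 100 points, and the other predictors are scaled proportionally. Whether the study's nomogram follows exactly this convention is an assumption, and the coefficients and value ranges in the example are hypothetical.

```python
def make_point_scale(coefficients, value_ranges):
    """Return a function points(name, value) on a 0-100 nomogram scale.

    coefficients: {predictor: regression coefficient}
    value_ranges: {predictor: (lowest value, highest value)}
    """
    spans = {
        name: abs(coefficients[name]) * (high - low)
        for name, (low, high) in value_ranges.items()
    }
    max_span = max(spans.values())
    if max_span == 0:
        raise ValueError("at least one predictor must have a nonzero effect")

    def points(name, value):
        low, high = value_ranges[name]
        beta = coefficients[name]
        # Zero points are placed at the end of the range with the lowest contribution.
        baseline = low if beta >= 0 else high
        return 100.0 * beta * (value - baseline) / max_span

    return points


# Hypothetical two-predictor model, for illustration only.
points = make_point_scale({"a": 2.0, "b": -1.0}, {"a": (0, 1), "b": (0, 4)})
total_points = points("a", 1) + points("b", 0)
```

In this toy model, predictor "b" has the widest contribution range and therefore spans the full 100 points, while predictor "a" spans 50.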
T102 10095-10242 Sentence denotes The corresponding calibration curves of the CR model in the primary cohort and validation cohort are shown in the Supplementary Material (Fig. E3).
T103 10243-10302 Sentence denotes Fig. 2 Workflow of data process and analysis in this study.
T104 10303-10433 Sentence denotes Radiological semantic features, including qualitative and quantitative imaging features, are extracted from axial lung CT sections.
T105 10434-10526 Sentence denotes The clinical manifestations and laboratory parameters are provided by the electronic case system.
T106 10527-10641 Sentence denotes Statistical analysis is performed for comparing the different features between COVID-19 and non-COVID-19 patients.
T107 10642-10819 Sentence denotes Univariate analysis and the least absolute shrinkage and selection operator (LASSO) are then applied to the features with p < 0.05 in statistical analysis to determine the COVID-19 risk factors.
T108 10820-10916 Sentence denotes Three models based on the selected features are established by multivariate logistic regression.
T109 10917-11059 Sentence denotes These models include the radiological model (R model), the clinical model (C model), and the combined clinical and radiological model (CR model).
T110 11060-11237 Sentence denotes The performance and clinical benefits of the prediction model are assessed by the area under a receiver operating characteristic (ROC) curve and the decision curve, respectively
T111 11239-11259 Sentence denotes Statistical analysis
T112 11260-11320 Sentence denotes Statistical analysis was conducted with R software (Version:
T113 11321-11354 Sentence denotes 3.6.4, http://www.r-project.org/).
T114 11355-11463 Sentence denotes The reported significance levels were all two-sided, and the statistical significance level was set to 0.05.
T115 11464-11549 Sentence denotes The multivariate logistic regression analysis was performed with the “stats” package.
T116 11550-11610 Sentence denotes Nomogram construction was performed using the “rms” package.
T117 11611-11664 Sentence denotes Decision curve analysis was performed using the “dca.
T118 11665-11676 Sentence denotes R” package.
T119 11678-11685 Sentence denotes Results
T120 11687-11737 Sentence denotes Imaging and clinical manifestations between groups
T121 11738-11943 Sentence denotes The differences between patients with and without COVID-19 for all 67 features (41 imaging and 26 critical clinical features) are shown in Tables 1 and 2 and the Supplementary Materials (Tables E3 and E4).
T122 11944-12091 Sentence denotes The differences between the primary cohort and validation cohort for the same features are shown in the Supplementary Materials (Tables E5 and E6).
T123 12092-12249 Sentence denotes All characteristics except fatigue and white blood cell count in the CR model presented no significant difference between the primary and validation cohorts.
T124 12250-12361 Sentence denotes A total of 1745 lesions were identified, with 1062 from the COVID-19 group and 683 from the non-COVID-19 group.
T125 12362-12441 Sentence denotes Table 1 Radiological semantic features of patients in the COVID-19 and non-COVID-19 groups
T126 12442-12497 Sentence denotes Feature Non-COVID-19 (n = 66) COVID-19 (n = 70) p value
T127 12498-12516 Sentence denotes Number of pure GGO
T128 12517-12569 Sentence denotes   Total# 1.00 (0.00, 5.05) 3.50 (0.95, 8.05) 0.018b*
T129 12570-12632 Sentence denotes   Peripheral area# 1.00 (0.00, 4.05) 2.00 (0.00, 6.05) 0.032b*
T130 12633-12720 Sentence denotes   Central/both peripheral and central area# 0.00 (0.00, 0.00) 0.00 (0.00, 2.00) 0.001b*
T131 12721-12740 Sentence denotes Number of mixed GGO
T132 12741-12793 Sentence denotes   Total# 1.00 (0.00, 3.05) 3.00 (1.00, 9.00) 0.001b*
T133 12794-12858 Sentence denotes   Peripheral area# 0.00 (0.00, 2.00) 2.50 (1.00, 6.00) < 0.001b*
T134 12859-12945 Sentence denotes   Central/both peripheral and central area# 0.00 (0.00, 1.05) 0.00 (0.00, 2.00) 0.657b
T135 12946-12975 Sentence denotes Total number of consolidation
T136 12976-13036 Sentence denotes   Consolidation# 1.00 (0.00, 3.00) 0.00 (0.00, 0.05) 0.001b*
T137 13037-13101 Sentence denotes   Pure solid nodules# 0.00 (0.00, 0.00) 0.00 (0.00, 0.00) 0.309b
T138 13102-13171 Sentence denotes   Solid nodules with GGO# 0.00 (0.00, 0.00) 0.00 (0.00, 1.00) 0.033b*
T139 13172-13195 Sentence denotes Total number of lesions
T140 13196-13258 Sentence denotes   Peripheral area# 5.00 (2.00, 9.05) 7.00 (2.00, 13.00) 0.112b
T141 13259-13317 Sentence denotes   Central area# 0.00 (0.00, 3.00) 0.00 (0.00, 1.05) 0.960b
T142 13318-13396 Sentence denotes   Both peripheral and central area# 0.00 (0.00, 2.00) 0.00 (0.00, 2.05) 0.582b
T143 13397-13435 Sentence denotes Interlobular septal thickening 0.009a*
T144 13436-13470 Sentence denotes   Negative 44 (66.67%) 31 (44.29%)
T145 13471-13505 Sentence denotes   Positive 22 (33.33%) 39 (55.71%)
T146 13506-13536 Sentence denotes Crazy paving pattern < 0.001a*
T147 13537-13571 Sentence denotes   Negative 60 (90.91%) 32 (45.71%)
T148 13572-13604 Sentence denotes   Positive 6 (9.09%) 38 (54.29%)
T149 13605-13631 Sentence denotes Tree-in-bud sign < 0.001a*
T150 13632-13666 Sentence denotes   Negative 37 (56.06%) 61 (87.14%)
T151 13667-13700 Sentence denotes   Positive 29 (43.94%) 9 (12.86%)
T152 13701-13727 Sentence denotes Pleural thickening 0.030a*
T153 13728-13762 Sentence denotes   Negative 46 (69.70%) 36 (51.43%)
T154 13763-13797 Sentence denotes   Positive 20 (30.30%) 34 (48.57%)
T155 13798-13848 Sentence denotes Offending vessel augmentation in lesions < 0.001a*
T156 13849-13883 Sentence denotes   Negative 55 (83.33%) 17 (24.29%)
T157 13884-13918 Sentence denotes   Positive 11 (16.67%) 53 (75.71%)
T158 13919-13945 Sentence denotes GGO ground-glass opacities
T159 13946-14089 Sentence denotes #Results are medians with interquartile ranges in parentheses; the remaining results are counts with the corresponding percentages in parentheses
T160 14090-14151 Sentence denotes *Data with statistical significance. pa: chi-square test, pb:
T161 14152-14168 Sentence denotes Student’s t test
T162 14169-14235 Sentence denotes Table 2 Clinical features of patients in the COVID-19 and non-COVID-19 groups
T163 14236-14291 Sentence denotes Feature Non-COVID-19 (n = 66) COVID-19 (n = 70) p value
T164 14292-14295 Sentence denotes Sex
T165 14296-14334 Sentence denotes   Male# 43 (65.15%) 41 (58.57%) 0.430a
T166 14335-14368 Sentence denotes   Female# 23 (34.85%) 29 (41.43%)
T167 14369-14417 Sentence denotes   Age (years) 46.73 ± 25.00 42.93 ± 13.32 0.275b
T168 14418-14429 Sentence denotes Vital signs
T169 14430-14499 Sentence denotes   Systolic blood pressure (mmHg) 126.92 ± 23.07 127.07 ± 15.16 0.965b
T170 14500-14568 Sentence denotes   Diastolic blood pressure (mmHg) 77.74 ± 15.72 80.39 ± 10.51 0.254b
T171 14569-14629 Sentence denotes   Respiration rate (bpm) 25.20 ± 7.29 19.86 ± 1.90 < 0.001b*
T172 14630-14687 Sentence denotes   Heart rate (bpm) 101.59 ± 20.36 86.06 ± 13.34 < 0.001b*
T173 14688-14740 Sentence denotes   Temperature (°C) 37.61 ± 1.06 37.12 ± 0.83 0.003b*
T174 14741-14746 Sentence denotes Signs
T175 14747-14791 Sentence denotes   Dry cough# 56 (84.85%) 48 (68.57%) 0.025a*
T176 14792-14833 Sentence denotes   Fatigue# 8 (12.12%) 22 (31.43%) 0.007a*
T177 14834-14876 Sentence denotes   Sore throat# 6 (9.09%) 9 (12.86%) 0.483a
T178 14877-14913 Sentence denotes   Stuffy# 4 (6.06%) 2 (2.86%) 0.623a
T179 14914-14954 Sentence denotes   Runny nose# 3 (4.55%) 3 (4.29%) 0.731a
T180 14955-15022 Sentence denotes White blood cell count (× 109/L) 11.48 ± 5.36 5.27 ± 2.33 < 0.001b*
T181 15023-15064 Sentence denotes White blood cell count category < 0.001c*
T182 15065-15091 Sentence denotes   Low# 0 (0.00%) 2 (2.86%)
T183 15092-15125 Sentence denotes   Normal# 27 (40.91%) 63 (90.00%)
T184 15126-15155 Sentence denotes   High# 39 (59.09%) 5 (7.14%)
T185 15156-15213 Sentence denotes Lymphocyte count (× 109/L) 1.57 ± 1.33 1.25 ± 0.68 0.086b
T186 15214-15249 Sentence denotes Lymphocyte count category < 0.001c*
T187 15250-15280 Sentence denotes   Low# 24 (36.36%) 32 (45.71%)
T188 15281-15314 Sentence denotes   Normal# 35 (53.03%) 37 (52.86%)
T189 15315-15343 Sentence denotes   High# 7 (10.61%) 1 (1.43%)
T190 15344-15404 Sentence denotes Neutrophil count (× 109/L) 8.97 ± 4.90 3.53 ± 2.17 < 0.001b*
T191 15405-15440 Sentence denotes Neutrophil count category < 0.001c*
T192 15441-15468 Sentence denotes   Low# 3 (4.55%) 8 (11.43%)
T193 15469-15502 Sentence denotes   Normal# 23 (34.85%) 59 (84.29%)
T194 15503-15532 Sentence denotes   High# 40 (60.61%) 3 (4.29%)
T195 15533-15596 Sentence denotes C-reactive protein (mg/L) 69.30 ± 65.88 26.37 ± 30.97 < 0.001b*
T196 15597-15650 Sentence denotes Procalcitonin (ng/mL) 3.36 ± 8.98 0.26 ± 0.84 0.007b*
T197 15651-15712 Sentence denotes *Data with statistical significance. pa: chi-square test, pb:
T198 15713-15734 Sentence denotes Student’s t test. pc:
T199 15735-15756 Sentence denotes Kruskal-Wallis H test
T200 15757-15822 Sentence denotes #Results are measurements with corresponding ratio in parentheses
T201 15823-15913 Sentence denotes For imaging manifestations, 7 patients in the COVID-19 group showed normal chest CT (10%).
T202 15914-16047 Sentence denotes COVID-19 patients have a greater number of pure GGO and mixed GGO than non-COVID-19 patients (p = 0.018 and p = 0.001, respectively).
T203 16048-16166 Sentence denotes For pure GGO lesions, the differences are significant both in peripheral (p = 0.032) and in central areas (p = 0.001).
T204 16167-16324 Sentence denotes However, for mixed GGO, the difference is significant only at the periphery, where COVID-19 patients have more lesions (p < 0.001), with no statistical difference in the central area.
T205 16325-16410 Sentence denotes The consolidation lesions without GGO occurred less in COVID-19 patients (p = 0.001).
T206 16411-16552 Sentence denotes More lesions are between 1 and 3 cm (p = 0.027), and fewer lesions are larger than half of the lung segment (p = 0.017) in COVID-19 patients.
T207 16553-16911 Sentence denotes Other significant differences between the two groups include the pleural traction sign (p = 0.019), bronchial wall thickening (p < 0.001), interlobular septal thickening (p = 0.009), crazy paving (p < 0.001), tree-in-bud (p < 0.001), pleural effusions (p < 0.001), pleural thickening (p = 0.030), and the offending vessel augmentation in lesions (p < 0.001).
T208 16912-17007 Sentence denotes The lung score presents no significant difference between the COVID-19 and non-COVID-19 groups.
T209 17008-17124 Sentence denotes Comparison of clinical features between the two groups of patients with and without COVID-19 is reported in Table 2.
T210 17125-17198 Sentence denotes There is no significant difference in age and sex between the two groups.
T211 17199-17344 Sentence denotes Significant differences are found in common symptoms between groups, including fever (p = 0.003), dry cough (p = 0.025), and fatigue (p = 0.007).
T212 17345-17455 Sentence denotes The respiration rate and heart rate also show significant differences between the two groups (both p < 0.001).
T213 17456-17577 Sentence denotes Compared with non-COVID-19 pneumonia, the reduction of the WBC count is more pronounced in COVID-19 patients (p < 0.001).
T214 17578-17702 Sentence denotes The ratio of lymphocyte and ratio of neutrophil also show a significant difference between COVID-19 and non-COVID-19 groups.
T215 17703-17850 Sentence denotes Although lymphopenia was observed in 32 COVID-19 patients (45.71%), it is not statistically different compared with that in the non-COVID-19 group.
T216 17851-18002 Sentence denotes C-reactive protein (CRP) level and procalcitonin level are also significantly different between the two groups (p < 0.001 and p = 0.007, respectively).
T217 18003-18070 Sentence denotes Most COVID-19 patients present normal procalcitonin level (82.86%).
T218 18072-18115 Sentence denotes Clinical and radiological feature selection
T219 18116-18260 Sentence denotes Of the features, 18 radiological features and 17 clinical features were selected to form the predictors based on the result from Tables 1 and 2.
T220 18261-18330 Sentence denotes Table 3 lists the features selected by univariate analysis and LASSO.
T221 18331-18379 Sentence denotes Table 3 Selected features in C, R, and CR models
T222 18380-18422 Sentence denotes Model and individual features Coefficients
T223 18423-18437 Sentence denotes R, n = 8 (41)*
T224 18438-18457 Sentence denotes   Intercept − 0.307
T225 18458-18510 Sentence denotes   Total number of mixed GGO in peripheral area 0.359
T226 18511-18550 Sentence denotes   Total number of consolidation − 1.262
T227 18551-18616 Sentence denotes   Total number of solid nodules with ground-glass opacities 0.452
T228 18617-18657 Sentence denotes   Interlobular septal thickening − 5.559
T229 18658-18686 Sentence denotes   Crazy paving pattern 3.566
T230 18687-18708 Sentence denotes   Tree-in-bud − 2.548
T231 18709-18735 Sentence denotes   Pleural thickening 3.265
T232 18736-18784 Sentence denotes   Offending vessel augmentation in lesions 5.504
T233 18785-18799 Sentence denotes C, n = 7 (26)*
T234 18800-18818 Sentence denotes   Intercept 29.273
T235 18819-18840 Sentence denotes   Respiration − 0.359
T236 18841-18861 Sentence denotes   Heart rate − 0.054
T237 18862-18883 Sentence denotes   Temperature − 0.289
T238 18884-18916 Sentence denotes   White blood cell count − 0.175
T239 18917-18932 Sentence denotes   Cough − 1.866
T240 18933-18948 Sentence denotes   Fatigue 2.855
T241 18949-18984 Sentence denotes   Lymphocyte count category − 0.028
T242 18985-19001 Sentence denotes CR, n = 10 (67)*
T243 19002-19020 Sentence denotes   Intercept 45.117
T244 19021-19073 Sentence denotes   Total number of mixed GGO in peripheral area 0.108
T245 19074-19095 Sentence denotes   Tree-in-bud − 1.853
T246 19096-19144 Sentence denotes   Offending vessel augmentation in lesions 6.000
T247 19145-19166 Sentence denotes   Respiration − 0.583
T248 19167-19188 Sentence denotes   Heart rate − 0.084
T249 19189-19210 Sentence denotes   Temperature − 0.536
T250 19211-19243 Sentence denotes   White blood cell count − 0.471
T251 19244-19259 Sentence denotes   Cough − 0.997
T252 19260-19277 Sentence denotes   Fatigue − 0.228
T253 19278-19313 Sentence denotes   Lymphocyte count category − 2.177
T254 19314-19496 Sentence denotes C, R, and CR indicate the prediction models based on clinical features, radiological features, and the combination of clinical and radiological features, respectively
T255 19497-19582 Sentence denotes *n means corresponding selected features, and data in parentheses are total features.
T256 19583-19695 Sentence denotes Coefficients: the estimate value of each feature in multivariate logistic regression model by “glm” package in R
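To illustrate how the coefficients in Table 3 translate into a predicted probability, the following Python sketch applies the logistic function to the CR model. The coefficient values are copied from Table 3 (reading the "Heart ratio" row as heart rate); the variable names, the coding of the predictors (binary signs as 0/1, the lymphocyte count category as a single number, continuous values in the units of Table 2), and the example patient are assumptions, since the coding is not stated in this section.

```python
import math

# CR model coefficients as listed in Table 3.
CR_INTERCEPT = 45.117
CR_COEFFICIENTS = {
    "mixed_ggo_peripheral": 0.108,   # total number of mixed GGO in peripheral area
    "tree_in_bud": -1.853,           # assumed coding: 0 = negative, 1 = positive
    "vessel_augmentation": 6.000,    # assumed coding: 0 = negative, 1 = positive
    "respiration_rate": -0.583,
    "heart_rate": -0.084,
    "temperature": -0.536,
    "wbc_count": -0.471,
    "cough": -0.997,                 # assumed coding: 0 = absent, 1 = present
    "fatigue": -0.228,               # assumed coding: 0 = absent, 1 = present
    "lymphocyte_category": -2.177,   # assumed numeric coding of the category
}


def predicted_probability(features):
    """Logistic-regression probability for one patient's feature values."""
    missing = set(CR_COEFFICIENTS) - set(features)
    if missing:
        raise ValueError(f"missing features: {sorted(missing)}")
    linear = CR_INTERCEPT + sum(
        CR_COEFFICIENTS[name] * features[name] for name in CR_COEFFICIENTS
    )
    # Numerically stable logistic function.
    if linear >= 0:
        return 1.0 / (1.0 + math.exp(-linear))
    e = math.exp(linear)
    return e / (1.0 + e)


# Hypothetical patient, for illustration only.
hypothetical_patient = {
    "mixed_ggo_peripheral": 3, "tree_in_bud": 0, "vessel_augmentation": 1,
    "respiration_rate": 20, "heart_rate": 86, "temperature": 37.1,
    "wbc_count": 5.3, "cough": 1, "fatigue": 0, "lymphocyte_category": 1,
}
probability = predicted_probability(hypothetical_patient)
```

Because the true predictor coding is unknown here, the resulting number should be read only as a demonstration of the arithmetic, not as a reproduction of the published model's output.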
T257 19697-19729 Sentence denotes Model development and validation
T258 19730-19931 Sentence denotes The prediction models based on (i) clinical features (C model), (ii) radiological features (R model), and (iii) the combination of clinical features and radiological features (CR model) were developed.
T259 19932-20015 Sentence denotes ROC analyses for the primary and validation cohort are shown in Table 4 and Fig. 3.
T260 20016-20208 Sentence denotes The CR model yielded the highest AUC in the primary cohort, 0.986 (95% CI 0.966~1.000), along with the highest accuracy and specificity; its AUC was 0.936 (95% CI 0.866~1.000) in the validation cohort.
T261 20209-20347 Sentence denotes The AUC for the C model was 0.952 (95% CI 0.915~0.988) and 0.967 (95% CI 0.919~1.000) in the primary and validation cohorts, respectively.
T262 20348-20468 Sentence denotes For the R model, the AUC of the two cohorts was 0.969 (95% CI 0.940~0.997) and 0.809 (95% CI 0.669~0.948), respectively.
T263 20469-20528 Sentence denotes Table 4 Performance of the individualized prediction models
T264 20529-20579 Sentence denotes Primary cohort (n = 98) Validation cohort (n = 38)
T265 20580-20674 Sentence denotes Models AUC 95% CI Accuracy Specificity Sensitivity AUC 95% CI Accuracy Specificity Sensitivity
T266 20675-20754 Sentence denotes C model 0.952 0.915~0.988 0.888 0.894 0.882 0.967 0.919~1.000 0.868 0.859 0.842
T267 20755-20834 Sentence denotes R model 0.969 0.940~0.997 0.929 0.851 1.000 0.809 0.669~0.948 0.684 0.368 1.000
T268 20835-20915 Sentence denotes CR model 0.986 0.966~1.000 0.959 0.957 0.961 0.936 0.866~1.000 0.763 0.789 0.737
T269 20916-21099 Sentence denotes C, R, and CR indicate the prediction models based on clinical features, radiological features, and the combination of clinical and radiological features, respectively.
T270 21100-21122 Sentence denotes CI confidence interval
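The accuracy, specificity, and sensitivity columns of Table 4 follow from a confusion matrix. The Python sketch below shows the arithmetic; the function name is hypothetical, and the counts in the example are back-calculated assumptions (19 COVID-19 patients and 19 controls in the validation cohort, with the R model flagging all patients and 12 controls as positive), not counts reported in the study.

```python
def classification_metrics(true_pos, false_neg, true_neg, false_pos):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    total = true_pos + false_neg + true_neg + false_pos
    return {
        "accuracy": (true_pos + true_neg) / total,
        "sensitivity": true_pos / (true_pos + false_neg),  # true-positive rate
        "specificity": true_neg / (true_neg + false_pos),  # true-negative rate
    }


# Assumed counts consistent with the R model's validation-cohort row in Table 4.
r_model_validation = classification_metrics(
    true_pos=19, false_neg=0, true_neg=7, false_pos=12
)
rounded = {name: round(value, 3) for name, value in r_model_validation.items()}
```

Under these assumed counts, the rounded values match the reported accuracy of 0.684, specificity of 0.368, and sensitivity of 1.000.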
T271 21123-21194 Sentence denotes Fig. 3 ROC curves of the three models in the primary and validation cohorts.
T272 21195-21462 Sentence denotes Comparison of receiver operating characteristic (ROC) curves among the radiological model (R model), clinical model (C model), and the combined clinical and radiological model (CR model) for the diagnosis of COVID-19 in the primary (a) and validation (b) cohorts
T273 21463-21668 Sentence denotes To determine the clinical usefulness of the diagnostic model, we developed the decision curve (Fig. 4), which showed better performances for the CR model compared with that for the C model and the R model.
T274 21669-21852 Sentence denotes Across the majority of the range of reasonable threshold probabilities, the decision curve analysis showed that the CR model had a higher overall benefit than the C model and R model.
T275 21853-21922 Sentence denotes Fig. 4 Decision curve analysis for each model in the primary dataset.
T276 21923-22217 Sentence denotes The y-axis measures the net benefit, which is calculated by summing the benefits (true-positive findings) and subtracting the harms (false-positive findings), weighting the latter by a factor related to the relative harm of undetected metastasis compared with the harm of unnecessary treatment.
T277 22218-22476 Sentence denotes The decision curve shows that if the threshold probability is over 10%, the application of the combination of clinical and radiological model (CR model) to diagnose COVID-19 adds more benefit than the clinical model (C model) and radiological model (R model)
T278 22477-22824 Sentence denotes The nomogram (Fig. 5) was developed from the CR model in the primary cohort and incorporates the following factors: the total number of mixed GGO in peripheral area (TN_Mixed_GGO_IP), tree-in-bud, offending vessel augmentation in lesions (OVAIL), respiration, heart rate, temperature, white blood cell count, cough, fatigue, and lymphocyte count category.
T279 22825-22929 Sentence denotes The total points were calculated by summing the points identified on the “points” scale for each factor.
T280 22930-23064 Sentence denotes By comparing the “total points” scale and the “probability” scale, the individual probability of COVID-19 infection could be obtained.
T281 23065-23119 Sentence denotes Fig. 5 Nomogram of the CR model in the primary cohort.
T282 23120-23197 Sentence denotes TN_Mixed_GGO_IP represented the total number of mixed GGO in peripheral area.
T283 23198-23257 Sentence denotes OVAIL represented offending vessel augmentation in lesions.
T284 23258-23311 Sentence denotes N was a negative result, and P was a positive result.
T285 23312-23336 Sentence denotes Norm represented normal.
T286 23337-23399 Sentence denotes Note that on the probability scale, 0 = non-COVID-19 and 1 = COVID-19.
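Reading a nomogram such as Fig. 5 amounts to summing per-factor points and mapping the total onto a probability. The sketch below assumes the underlying model is a logistic regression, which is the usual basis for a nomogram but is an assumption here. The factor names, point values, intercept, and slope are all hypothetical placeholders and are not the values of Fig. 5.

```python
import math

# Hypothetical point table, for illustration only (NOT read from Fig. 5).
# Each factor maps a category (N = negative, P = positive) to points.
POINTS = {
    "factor_a": {"N": 0.0, "P": 30.0},
    "factor_b": {"N": 0.0, "P": 20.0},
}


def total_points(patient, points_table=POINTS):
    """Sum the points read off the 'points' scale for each factor."""
    return sum(points_table[name][patient[name]] for name in points_table)


def points_to_probability(total, intercept, slope):
    """Map total points to a probability via a logistic link.

    intercept and slope are hypothetical calibration constants that tie
    the 'total points' scale to the model's linear predictor.
    """
    linear_predictor = intercept + slope * total
    return 1.0 / (1.0 + math.exp(-linear_predictor))


patient = {"factor_a": "P", "factor_b": "P"}
score = total_points(patient)
print(score, points_to_probability(score, intercept=-5.0, slope=0.1))
```

In a real nomogram the point values are rescaled regression coefficients, so the two steps together reproduce the model's predicted probability.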
T287 23401-23411 Sentence denotes Discussion
T288 23412-23577 Sentence denotes In this multi-center study, we statistically compared imaging and clinical manifestations between pneumonia patients with and without COVID-19.
T289 23578-23730 Sentence denotes Eighteen radiological semantic features and seventeen clinical features were found to be significantly different between the two groups (p < 0.05).
T290 23731-23812 Sentence denotes Three models for COVID-19 diagnosis were developed based on the refined features.
T291 23813-23919 Sentence denotes The models were validated in both the primary and validation cohorts and achieved an AUC as high as 0.986.
T292 23920-24111 Sentence denotes These models may play an essential role in early and easy-to-access diagnosis, especially when there are not enough RT-PCR kits or experimental platforms to test for COVID-19 infection.
T293 24112-24213 Sentence denotes A total of 1745 lesions were evaluated for qualitative features, location, and size in this study.
T294 24214-24407 Sentence denotes Consistent with the previous studies, the ground-glass opacities and consolidation in the lung periphery were considered to be the imaging hallmark in patients with COVID-19 infection [11, 25].
T295 24408-24551 Sentence denotes However, when we subdivided the GGO into pure GGO and mixed GGO, we found that the distribution pattern differed between these two lesion types.
T296 24552-24714 Sentence denotes Pure GGO show differences between groups in every location of the lungs, whereas mixed GGO only have significant differences between groups in the lung periphery.
T297 24715-24787 Sentence denotes Recent studies defined four stages of lung involvement in COVID-19 [26].
T298 24788-24864 Sentence denotes Therefore, a follow-up analysis of these distributions would be significant.
T299 24865-24953 Sentence denotes The lesion size in patients with COVID-19 infection was another interesting observation.
T300 24954-25097 Sentence denotes Most lesions were between 1 and 3 cm, with few lesions larger than half of the lung segment, which was similar to the finding in MERS-CoV [22].
T301 25098-25282 Sentence denotes Other features similar to MERS-CoV and SARS-CoV were observed in the laboratory abnormalities, such as lymphopenia, which may be associated with cellular immune deficiency [3, 27].
T302 25283-25399 Sentence denotes However, our results showed no significant difference in lymphopenia between the COVID-19 and non-COVID-19 patients.
T303 25400-25531 Sentence denotes To our knowledge, no diagnostic model based on imaging and clinical features alone has been proposed for the diagnosis of COVID-19.
T304 25532-25837 Sentence denotes Our clinical and radiological semantic (CR) model consisted of the following features: total number of GGO with consolidation in the peripheral area, tree-in-bud, offending vessel augmentation in lesions, temperature, heart rate, respiration, cough and fatigue, WBC count, and lymphocyte count category.
T305 25838-25909 Sentence denotes The CR model outperformed the individual clinical and radiological models.
T306 25910-26109 Sentence denotes This result was in accordance with a previous study in breast cancer, in which the model based on the combination of radiomics features and clinical features achieved a higher performance [24].
T307 26110-26303 Sentence denotes Compared with the radiomics-based model, the extraction of radiological semantic features can overcome the image discrepancy caused by different scanning parameters and/or different CT vendors.
T308 26304-26500 Sentence denotes A previous study [28] also indicated that models based on semantic features determined by an experienced thoracic radiologist slightly outperformed models based on computed texture features alone.
T309 26501-26543 Sentence denotes There are a few limitations in this study.
T310 26544-26702 Sentence denotes First, the sample size is relatively small because this is a retrospective analysis of a new disease and most of the cases outside of Wuhan City are imported.
T311 26703-27020 Sentence denotes Second, with the multi-center retrospective design, there is a potential for patient selection bias [29], and readers may have differed in how they marked semantic features, although we tried to reduce this variation by creating pictorial examples and setting feature criteria (Supplementary Materials).
T312 27021-27068 Sentence denotes Third, no longitudinal CT study was performed.
T313 27069-27208 Sentence denotes Whether or not this model can be used to evaluate the follow-ups and help to guide therapy remains an open question to be further explored.
T314 27209-27392 Sentence denotes Moreover, the rich high-order features of the CT image combined with radiomics or deep learning have not been studied, which may be another way to identify the patients with COVID-19.
T315 27393-27528 Sentence denotes Future work could also focus on the role of radiological features in disease monitoring, treatment evaluation, and prognosis prediction.
T316 27529-27640 Sentence denotes In conclusion, 1745 lesions and 67 features were compared between pneumonia patients with and without COVID-19.
T317 27641-27714 Sentence denotes Thirty-five features were significantly different between the two groups.
T318 27715-27885 Sentence denotes A diagnostic model with AUC as high as 0.986 was developed and validated both in the primary and in the validation cohorts, which may help improve the COVID-19 diagnosis.
T319 27887-27920 Sentence denotes Electronic supplementary material
T320 27922-27941 Sentence denotes ESM 1 (DOCX 404 kb)