PMC:7160614

Id Subject Object Predicate Lexical cue
T1 0-133 Sentence denotes A diagnostic model for coronavirus disease 2019 (COVID-19) based on radiological semantic and clinical features: a multi-center study
T2 135-143 Sentence denotes Abstract
T3 144-154 Sentence denotes Objectives
T4 155-255 Sentence denotes Rapid and accurate diagnosis of coronavirus disease 2019 (COVID-19) is critical during the epidemic.
T5 256-502 Sentence denotes We aim to identify differences in CT imaging and clinical manifestations between pneumonia patients with and without COVID-19, and to develop and validate a diagnostic model for COVID-19 based on radiological semantic and clinical features alone.
T6 504-511 Sentence denotes Methods
T7 512-641 Sentence denotes A consecutive cohort of 70 COVID-19 and 66 non-COVID-19 pneumonia patients were retrospectively recruited from five institutions.
T8 642-718 Sentence denotes Patients were divided into primary (n = 98) and validation (n = 38) cohorts.
T9 719-857 Sentence denotes The chi-square test, Student’s t test, and Kruskal-Wallis H test were performed, comparing 1745 lesions and 67 features in the two groups.
T10 858-979 Sentence denotes Three models were constructed using radiological semantic and clinical features through multivariate logistic regression.
T11 980-1081 Sentence denotes Diagnostic efficacies of developed models were quantified by receiver operating characteristic curve.
T12 1082-1151 Sentence denotes Clinical usage was evaluated by decision curve analysis and nomogram.
T13 1153-1160 Sentence denotes Results
T14 1161-1279 Sentence denotes Eighteen radiological semantic features and seventeen clinical features were identified to be significantly different.
T15 1280-1463 Sentence denotes Besides ground-glass opacities (p = 0.032) and consolidation (p = 0.001) in the lung periphery, the lesion size (1–3 cm) is also significant for the diagnosis of COVID-19 (p = 0.027).
T16 1464-1522 Sentence denotes Lung score presents no significant difference (p = 0.417).
T17 1523-1624 Sentence denotes Three diagnostic models achieved an area under the curve value as high as 0.986 (95% CI 0.966~1.000).
T18 1625-1747 Sentence denotes The clinical and radiological semantic models provided better diagnostic performance and greater net benefits.
T19 1749-1760 Sentence denotes Conclusions
T20 1761-1886 Sentence denotes Based on CT imaging and clinical manifestations alone, the pneumonia patients with and without COVID-19 can be distinguished.
T21 1887-2010 Sentence denotes A model composed of radiological semantic and clinical features has an excellent performance for the diagnosis of COVID-19.
T22 2012-2022 Sentence denotes Key Points
T23 2023-2150 Sentence denotes • Based on CT imaging and clinical manifestations alone, the pneumonia patients with and without COVID-19 can be distinguished.
T24 2151-2417 Sentence denotes • A diagnostic model for COVID-19 was developed and validated using radiological semantic and clinical features, which had an area under the curve value of 0.986 (95% CI 0.966~1.000) and 0.936 (95% CI 0.866~1.000) in the primary and validation cohorts, respectively.
T25 2419-2452 Sentence denotes Electronic supplementary material
T26 2453-2589 Sentence denotes The online version of this article (10.1007/s00330-020-06829-2) contains supplementary material, which is available to authorized users.
T27 2591-2603 Sentence denotes Introduction
T28 2604-2802 Sentence denotes On January 30, 2020, the World Health Organization (WHO) declared the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) outbreak a public health emergency of international concern.
T29 2803-2897 Sentence denotes This outbreak has infected all provinces of China and rapidly spread to the rest of the world.
T30 2898-3021 Sentence denotes At the time of writing this article (March 16, 2020), more than 158 countries and territories had been affected [1].
T31 3022-3350 Sentence denotes Whole-genome sequencing and phylogenetic analysis reveal that SARS-CoV-2 is similar to some beta coronaviruses detected in bats, but it is distinct from severe acute respiratory syndrome coronavirus (SARS-CoV) and Middle East respiratory syndrome coronavirus (MERS-CoV) [2].
T32 3351-3479 Sentence denotes Patients with COVID-19 develop pneumonia with associated symptoms of fever (98%), cough (76%), and myalgia or fatigue (44%) [3].
T33 3480-3578 Sentence denotes CT imaging plays a critical role in the diagnosis and the monitoring of disease progression [4–6].
T34 3579-3806 Sentence denotes The latest research studies described the characteristic imaging manifestations of COVID-19, including ground-glass opacities (GGO) (57 to 88%), bilateral involvement (76 to 88%), and peripheral distribution (33 to 85%) [7–10].
T35 3807-3943 Sentence denotes Other imaging features such as consolidation, cavitation, and interlobular septal thickening are also reported in some patients [11–13].
T36 3944-4064 Sentence denotes However, these imaging manifestations of COVID-19 are nonspecific and are difficult to distinguish from other pneumonia.
T37 4065-4221 Sentence denotes To our knowledge, there have been no studies explicitly comparing imaging and clinical characteristics between pneumonia patients with and without COVID-19.
T38 4222-4419 Sentence denotes The current diagnostic criterion for COVID-19 is the positive result of a nucleic acid test by real-time reverse transcription polymerase chain reaction (RT-PCR) or next-generation sequencing [14].
T39 4420-4587 Sentence denotes However, false-negative results caused by unstable specimen processing are relatively high in clinical practice, which has worsened the spread of the outbreak [15–18].
T40 4588-4702 Sentence denotes Moreover, laboratory testing for SARS-CoV-2 requires a rigorous platform, which is not available in all hospitals.
T41 4703-4777 Sentence denotes Thus, this requires specimen transfer, which may delay diagnosis for days.
T42 4778-4921 Sentence denotes Early and accurate diagnosis is crucial, particularly for critically ill patients who need emergency surgery and for those with pneumonia complications.
T43 4922-5095 Sentence denotes To solve these problems, we hypothesize that a diagnostic model can be developed based on CT imaging and clinical manifestations alone, independent of the nucleic acid test.
T44 5096-5221 Sentence denotes In this study, we identify the differences in imaging and clinical manifestations between patients with and without COVID-19.
T45 5222-5335 Sentence denotes We also develop and validate a model for COVID-19 diagnosis based on radiological semantic and clinical features.
T46 5337-5357 Sentence denotes Patients and methods
T47 5359-5367 Sentence denotes Patients
T48 5368-5519 Sentence denotes Ethical approvals by the institutional review boards were obtained for this retrospective analysis, and the need to obtain informed consent was waived.
T49 5520-5837 Sentence denotes From January 1 to February 8, 2020, seventy consecutive patients with COVID-19 admitted to 5 independent hospitals in 4 cities were enrolled in this study (mean age, 42.9 years; range, 16–69 years), including 41 men (mean age, 41.8 years; range, 16–69 years) and 29 women (mean age, 44.5 years; range, 16–66 years).
T50 5838-5943 Sentence denotes All patients were confirmed with SARS-CoV-2 infection by real-time RT-PCR and next-generation sequencing.
T51 5944-6068 Sentence denotes Of these patients, 24 were from Huizhou City, 25 from Shantou City, 15 from Yongzhou City, and the remaining 6 from Meizhou City.
T52 6069-6355 Sentence denotes During the same period, another 66 pneumonia patients without COVID-19 from Meizhou People’s Hospital were recruited as controls (mean age, 46.7 years; range, 0.3–93 years), including 43 men (mean age, 46.0 years; range, 0.3–93 years) and 23 women (mean age, 48.0 years; range, 1–86 years).
T53 6356-6428 Sentence denotes All the controls were confirmed with consecutive negative RT-PCR assays.
T54 6429-6578 Sentence denotes Figure E1 in the Supplementary Material shows the patient recruitment pathway for the control group, along with the inclusion and exclusion criteria.
T55 6579-6716 Sentence denotes According to previous studies [19–21], whose sample size is comparable with ours, the ratio between primary and validation cohort is 7:3.
T56 6717-6837 Sentence denotes In this study, a total of 136 patients were divided into primary (n = 98) and validation (n = 38) cohorts, close to 7:3.
T57 6838-7119 Sentence denotes A total of 19 COVID-19 patients from two hospitals (6 patients from Meizhou People’s Hospital and 13 patients from the First Affiliated Hospital of Shantou University Medical College) and 19 randomly selected controls from Meizhou City were incorporated into the validation cohort.
T58 7120-7293 Sentence denotes The remaining patients were incorporated into the primary cohort, including 51 COVID-19 patients from Huizhou, Yongzhou, and Shantou cities and 47 controls from Meizhou City.
T59 7294-7484 Sentence denotes The primary cohort was utilized to select the most valuable features and build the predictive model, and the validation cohort was used to evaluate and validate the performance of the model.
T60 7486-7520 Sentence denotes Image and clinical data collection
T61 7521-7821 Sentence denotes The chest CT imaging data without contrast material enhancement were obtained from multiple hospitals with different CT systems, including GE CT Discovery 750 HD (General Electric Company), SCENARIA 64 CT (Hitachi Medical), Philips Ingenuity CT (PHILIPS), and Siemens SOMATOM Definition AS (Siemens).
T62 7822-7896 Sentence denotes All images were reconstructed into 1-mm slices with a slice gap of 0.8 mm.
T63 7897-7986 Sentence denotes Detailed acquisition parameters were summarized in the Supplementary Material (Table E1).
T64 7987-8081 Sentence denotes The clinical history, nursing records, and laboratory findings were reviewed for all patients.
T65 8082-8274 Sentence denotes Clinical characteristics, including demographic information, daily body temperature, blood pressure, heart rate, clinical symptoms, and history of exposure to epidemic centers, were collected.
T66 8275-8499 Sentence denotes Total white blood cell (WBC) counts, lymphocyte counts, ratio of lymphocyte, neutrophil count, ratio of neutrophil, procalcitonin (PCT), C-reactive protein level (CRP), and erythrocyte sedimentation rate (ESR) were measured.
T67 8500-8615 Sentence denotes All threshold values chosen for laboratory metrics were based on the normal ranges set by each individual hospital.
T68 8617-8631 Sentence denotes Image analysis
T69 8632-8821 Sentence denotes For extraction of radiological semantic features, two senior radiologists (D.L. and X.C., more than 15 years of experience) reached a consensus, blinded to clinical and laboratory findings.
T70 8822-8917 Sentence denotes The radiological semantic features included both qualitative and quantitative imaging features.
T71 8918-9066 Sentence denotes The lesions in the outer third of the lung were defined as peripheral, and lesions in the inner two-thirds of the lung were defined as central [22].
T72 9067-9297 Sentence denotes The progression of COVID-19 lesions within each lung lobe was evaluated by scoring each lobe from 0 to 4 [7], corresponding to normal, 1~25% infection, 26~50% infection, 51~75% infection, and more than 75% infection, respectively.
T73 9298-9388 Sentence denotes The scores were combined for all five lobes to provide a total score ranging from 0 to 20.
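The lobe scoring described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, the input is assumed to be the percentage of each lobe that is involved, and any involvement above 0% and up to 25% is assumed to score 1.

```python
def lobe_score(percent_involved):
    """Map the percentage of one lobe that is involved (0-100) to a score of 0-4."""
    if not 0 <= percent_involved <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent_involved == 0:
        return 0  # normal lobe
    if percent_involved <= 25:
        return 1
    if percent_involved <= 50:
        return 2
    if percent_involved <= 75:
        return 3
    return 4  # more than 75% involved


def total_lung_score(lobe_percentages):
    """Sum the scores of the five lobes, giving a total between 0 and 20."""
    if len(lobe_percentages) != 5:
        raise ValueError("expected one percentage for each of the five lobes")
    return sum(lobe_score(p) for p in lobe_percentages)
```

For example, hypothetical involvement of 0%, 10%, 30%, 60%, and 90% across the five lobes gives lobe scores of 0, 1, 2, 3, and 4.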
T74 9389-9494 Sentence denotes A total of 41 radiological features (26 quantitative and 15 qualitative) were extracted for the analysis.
T75 9495-9598 Sentence denotes The descriptions of radiological semantic features are listed in the Supplementary Material (Table E2).
T76 9599-9655 Sentence denotes Figure 1 is one example of the evaluation of CT imaging.
T77 9656-9737 Sentence denotes Fig. 1 A 23-year-old female with a travel history to Wuhan presenting with fever.
T78 9738-9864 Sentence denotes Axial noncontrast CT image shows a consolidation with ground-glass opacities in the peripheral region of the right upper lobe.
T79 9865-9900 Sentence denotes An air bronchogram is found in the lesion.
T80 9901-9942 Sentence denotes The maximum diameter of the lesion is 2.8 cm.
T81 9943-10028 Sentence denotes The right upper lobe score is 1 because less than 1/4 of the lung parenchyma is involved
T82 10030-10073 Sentence denotes Clinical and radiological feature selection
T83 10074-10276 Sentence denotes To obtain the most valuable clinical and radiological semantic features, statistical analysis, univariate analysis, and the least absolute shrinkage and selection operator (LASSO) method were performed.
T84 10277-10475 Sentence denotes In statistical analysis, the chi-square test, the Kruskal-Wallis H test, and t test were utilized to compare the radiological semantic and clinical features between COVID-19 and non-COVID-19 groups.
T85 10476-10534 Sentence denotes The features with p value smaller than 0.05 were selected.
T86 10535-10663 Sentence denotes Then, univariate analysis was performed for clinical and radiological candidate features to determine the COVID-19 risk factors.
T87 10664-10750 Sentence denotes The features with p value smaller than 0.05 in univariate analysis were also selected.
T88 10751-10974 Sentence denotes The least absolute shrinkage and selection operator (LASSO) method [23] was utilized to select the most useful features with penalty parameter tuning that was conducted by 10-fold cross-validation based on minimum criteria.
T89 10975-11078 Sentence denotes Diagnostic models were then constructed by multivariate logistic regression with the selected features.
T90 11079-11197 Sentence denotes The flowchart of the feature selection process for these models was presented in the Supplementary Material (Fig. E2).
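As an illustration of the first screening step, the chi-square comparison of a binary feature between the two groups can be sketched as follows. The study itself used R; this Python sketch assumes a Pearson chi-square test on a 2 × 2 table without continuity correction (the source does not state whether a correction was applied), and the function name is hypothetical. The counts are those reported for interlobular septal thickening in Table 1.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test for a 2 x 2 table, without continuity correction.

    Returns the test statistic and the p value for 1 degree of freedom.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # For 1 degree of freedom, the upper tail of the chi-square
    # distribution equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Rows: sign negative / positive; columns: non-COVID-19 / COVID-19 (Table 1).
septal_thickening = [[44, 31], [22, 39]]
statistic, p_value = chi_square_2x2(septal_thickening)
keep_feature = p_value < 0.05  # selection rule stated in the text
```

Under these assumptions the p value should land near the 0.009 reported for this feature, so the feature passes the 0.05 screening threshold.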
T91 11199-11249 Sentence denotes Development and validation of the diagnostic model
T92 11250-11545 Sentence denotes To develop an optimal model, we evaluated 3 models by analyzing (i) the clinical features model (C model), (ii) radiological semantic features model (R model), and (iii) the combination of clinical and radiological semantic features model (CR model) by multivariate logistic regression analysis.
T93 11546-11675 Sentence denotes The classification performances of the models were evaluated by the area under the receiver operating characteristic (ROC) curve.
T94 11676-11768 Sentence denotes The area under the curve (AUC), accuracy, sensitivity, and specificity were also calculated.
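The metrics named above can be sketched as follows. This is a minimal Python illustration rather than the authors' R code: the function names are hypothetical, and the AUC is computed through its rank interpretation, that is, the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def auc_from_scores(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels are 1 (positive) or 0 (negative); scores are predicted probabilities.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))
```

As a consistency check, counts of 19 true positives, 0 false negatives, 7 true negatives, and 12 false positives (values inferred here from the rates reported for the R model in the validation cohort of 19 patients and 19 controls, not stated in the source) reproduce an accuracy of 0.684, a sensitivity of 1.000, and a specificity of 0.368.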
T95 11769-11970 Sentence denotes A decision curve analysis was conducted to determine the clinical usefulness of the diagnostic model by quantifying the net benefits at different threshold probabilities in the validation dataset [24].
T96 11971-12050 Sentence denotes The development of decision curve was described in the Supplementary Materials.
T97 12051-12132 Sentence denotes Figure 2 depicts the flowchart of the proposed analysis pipeline described above.
T98 12133-12337 Sentence denotes We also built a nomogram, which was a quantitative tool to predict the individual probability of infection by COVID-19, based on the multivariate logistic analysis of the CR model with the primary cohort.
T99 12338-12494 Sentence denotes Depending on the coefficient of the predictive factors in multivariate logistic regression model, all values of each predictive factor were assigned points.
T100 12495-12574 Sentence denotes A total point was obtained by summing all the points of each predictive factor.
T101 12575-12685 Sentence denotes The scale also showed the relationship between the total point and the prediction probability in the nomogram.
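The point assignment described above can be sketched as follows. This is a simplified Python illustration of how a nomogram maps logistic regression coefficients to points, not the implementation of the "rms" package; the function names, coefficients, and predictor ranges in the example are hypothetical. The predictor with the largest effect over its range spans 0 to 100 points, and the total points are converted back to a probability through the logistic function.

```python
import math


def reference_value(coef, value_range):
    """End of the range that contributes the least to the linear predictor."""
    low, high = value_range
    return low if coef >= 0 else high


def nomogram_points(coefs, ranges, values):
    """Assign points to each predictor value; returns (points, max_effect)."""
    max_effect = max(abs(coefs[k]) * (ranges[k][1] - ranges[k][0]) for k in coefs)
    points = {}
    for name, coef in coefs.items():
        ref = reference_value(coef, ranges[name])
        points[name] = 100.0 * coef * (values[name] - ref) / max_effect
    return points, max_effect


def probability_from_points(total_points, intercept, coefs, ranges, max_effect):
    """Convert total points back to a predicted probability."""
    baseline = intercept + sum(
        coef * reference_value(coef, ranges[name]) for name, coef in coefs.items()
    )
    linear_predictor = baseline + total_points * max_effect / 100.0
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Hypothetical two-predictor model, for illustration only.
coefs = {"fatigue": 1.2, "heart_rate": -0.05}
ranges = {"fatigue": (0, 1), "heart_rate": (50, 150)}
patient = {"fatigue": 1, "heart_rate": 90}
points, max_effect = nomogram_points(coefs, ranges, patient)
probability = probability_from_points(
    sum(points.values()), 4.0, coefs, ranges, max_effect
)
```

By construction, the probability obtained from the total points equals the one obtained by applying the logistic function directly to the intercept plus the weighted sum of the predictor values.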
T102 12686-12833 Sentence denotes The corresponding calibration curves of the CR model in the primary cohort and validation cohort are shown in the Supplementary Material (Fig. E3).
T103 12834-12893 Sentence denotes Fig. 2 Workflow of data processing and analysis in this study.
T104 12894-13024 Sentence denotes Radiological semantic features, including qualitative and quantitative imaging features, are extracted from axial lung CT section.
T105 13025-13117 Sentence denotes The clinical manifestation and laboratory parameters are provided by electronic case system.
T106 13118-13232 Sentence denotes Statistical analysis is performed for comparing the different features between COVID-19 and non-COVID-19 patients.
T107 13233-13410 Sentence denotes Univariate analysis and the least absolute shrinkage and selection operator (LASSO) are further performed to determine the COVID-19 risk factors with p < 0.05 in statistical analysis.
T108 13411-13507 Sentence denotes Three models based on the selected features are established by multivariate logistic regression.
T109 13508-13650 Sentence denotes These models include the radiological model (R model), the clinical model (C model), and the combined clinical and radiological model (CR model).
T110 13651-13828 Sentence denotes The performance and clinical benefits of the prediction model are assessed by the area under a receiver operating characteristic (ROC) curve and the decision curve, respectively
T111 13830-13850 Sentence denotes Statistical analysis
T112 13851-13945 Sentence denotes Statistical analysis was conducted with R software (Version: 3.6.4, http://www.r-project.org/).
T114 13946-14054 Sentence denotes The reported significance levels were all two-sided, and the statistical significance level was set to 0.05.
T115 14055-14140 Sentence denotes The multivariate logistic regression analysis was performed with the “stats” package.
T116 14141-14201 Sentence denotes Nomogram construction was performed using the “rms” package.
T117 14202-14267 Sentence denotes Decision curve analysis was performed using the “dca.R” package.
T119 14269-14276 Sentence denotes Results
T120 14278-14328 Sentence denotes Imaging and clinical manifestations between groups
T121 14329-14534 Sentence denotes The differences between patients with and without COVID-19 for all 67 features (41 imaging and 26 critical clinical features) are shown in Tables 1 and 2 and the Supplementary Materials (Tables E3 and E4).
T122 14535-14682 Sentence denotes The differences between the primary cohort and validation cohort for the same features are shown in the Supplementary Materials (Tables E5 and E6).
T123 14683-14840 Sentence denotes All characteristics except fatigue and white blood cell count in the CR model presented no significant difference between the primary and validation cohorts.
T124 14841-14952 Sentence denotes A total of 1745 lesions were identified, with 1062 from the COVID-19 group and 683 from the non-COVID-19 group.
T125 14953-15032 Sentence denotes Table 1 Radiological semantic features of patients in COVID-19 and non-COVID-19
T126 15033-15088 Sentence denotes Feature Non-COVID-19 (n = 66) COVID-19 (n = 70) p value
T127 15089-15107 Sentence denotes Number of pure GGO
T128 15108-15160 Sentence denotes   Total# 1.00 (0.00, 5.05) 3.50 (0.95, 8.05) 0.018b*
T129 15161-15223 Sentence denotes   Peripheral area# 1.00 (0.00, 4.05) 2.00 (0.00, 6.05) 0.032b*
T130 15224-15311 Sentence denotes   Central/both peripheral and central area# 0.00 (0.00, 0.00) 0.00 (0.00, 2.00) 0.001b*
T131 15312-15331 Sentence denotes Number of mixed GGO
T132 15332-15384 Sentence denotes   Total# 1.00 (0.00, 3.05) 3.00 (1.00, 9.00) 0.001b*
T133 15385-15449 Sentence denotes   Peripheral area# 0.00 (0.00, 2.00) 2.50 (1.00, 6.00) < 0.001b*
T134 15450-15536 Sentence denotes   Central/both peripheral and central area# 0.00 (0.00, 1.05) 0.00 (0.00, 2.00) 0.657b
T135 15537-15566 Sentence denotes Total number of consolidation
T136 15567-15627 Sentence denotes   Consolidation# 1.00 (0.00, 3.00) 0.00 (0.00, 0.05) 0.001b*
T137 15628-15692 Sentence denotes   Pure solid nodules# 0.00 (0.00, 0.00) 0.00 (0.00, 0.00) 0.309b
T138 15693-15762 Sentence denotes   Solid nodules with GGO# 0.00 (0.00, 0.00) 0.00 (0.00, 1.00) 0.033b*
T139 15763-15786 Sentence denotes Total number of lesions
T140 15787-15849 Sentence denotes   Peripheral area# 5.00 (2.00, 9.05) 7.00 (2.00, 13.00) 0.112b
T141 15850-15908 Sentence denotes   Central area# 0.00 (0.00, 3.00) 0.00 (0.00, 1.05) 0.960b
T142 15909-15987 Sentence denotes   Both peripheral and central area# 0.00 (0.00, 2.00) 0.00 (0.00, 2.05) 0.582b
T143 15988-16026 Sentence denotes Interlobular septal thickening 0.009a*
T144 16027-16061 Sentence denotes   Negative 44 (66.67%) 31 (44.29%)
T145 16062-16096 Sentence denotes   Positive 22 (33.33%) 39 (55.71%)
T146 16097-16127 Sentence denotes Crazy paving pattern < 0.001a*
T147 16128-16162 Sentence denotes   Negative 60 (90.91%) 32 (45.71%)
T148 16163-16195 Sentence denotes   Positive 6 (9.09%) 38 (54.29%)
T149 16196-16222 Sentence denotes Tree-in-bud sign < 0.001a*
T150 16223-16257 Sentence denotes   Negative 37 (56.06%) 61 (87.14%)
T151 16258-16291 Sentence denotes   Positive 29 (43.94%) 9 (12.86%)
T152 16292-16318 Sentence denotes Pleural thickening 0.030a*
T153 16319-16353 Sentence denotes   Negative 46 (69.70%) 36 (51.43%)
T154 16354-16388 Sentence denotes   Positive 20 (30.30%) 34 (48.57%)
T155 16389-16439 Sentence denotes Offending vessel augmentation in lesions < 0.001a*
T156 16440-16474 Sentence denotes   Negative 55 (83.33%) 17 (24.29%)
T157 16475-16509 Sentence denotes   Positive 11 (16.67%) 53 (75.71%)
T158 16510-16536 Sentence denotes GGO ground-glass opacities
T159 16537-16680 Sentence denotes #Results are median with interquartile range in parentheses, and the remaining results are measurements with corresponding ratio in parentheses
T160 16681-16759 Sentence denotes *Data with statistical significance. pa: chi-square test, pb: Student’s t test
T162 16760-16826 Sentence denotes Table 2 Clinical features of patients in COVID-19 and non-COVID-19
T163 16827-16882 Sentence denotes Feature Non-COVID-19 (n = 66) COVID-19 (n = 70) p value
T164 16883-16886 Sentence denotes Sex
T165 16887-16925 Sentence denotes   Male# 43 (65.15%) 41 (58.57%) 0.430a
T166 16926-16959 Sentence denotes   Female# 23 (34.85%) 29 (41.43%)
T167 16960-17008 Sentence denotes   Age (years) 46.73 ± 25.00 42.93 ± 13.32 0.275b
T168 17009-17020 Sentence denotes Vital signs
T169 17021-17090 Sentence denotes   Systolic blood pressure (mmHg) 126.92 ± 23.07 127.07 ± 15.16 0.965b
T170 17091-17159 Sentence denotes   Diastolic blood pressure (mmHg) 77.74 ± 15.72 80.39 ± 10.51 0.254b
T171 17160-17220 Sentence denotes   Respiration rate (bpm) 25.20 ± 7.29 19.86 ± 1.90 < 0.001b*
T172 17221-17278 Sentence denotes   Heart rate (bpm) 101.59 ± 20.36 86.06 ± 13.34 < 0.001b*
T173 17279-17331 Sentence denotes   Temperature (°C) 37.61 ± 1.06 37.12 ± 0.83 0.003b*
T174 17332-17337 Sentence denotes Signs
T175 17338-17382 Sentence denotes   Dry cough# 56 (84.85%) 48 (68.57%) 0.025a*
T176 17383-17424 Sentence denotes   Fatigue# 8 (12.12%) 22 (31.43%) 0.007a*
T177 17425-17467 Sentence denotes   Sore throat# 6 (9.09%) 9 (12.86%) 0.483a
T178 17468-17504 Sentence denotes   Stuffy# 4 (6.06%) 2 (2.86%) 0.623a
T179 17505-17545 Sentence denotes   Runny nose# 3 (4.55%) 3 (4.29%) 0.731a
T180 17546-17613 Sentence denotes White blood cell count (× 10^9/L) 11.48 ± 5.36 5.27 ± 2.33 < 0.001b*
T181 17614-17655 Sentence denotes White blood cell count category < 0.001c*
T182 17656-17682 Sentence denotes   Low# 0 (0.00%) 2 (2.86%)
T183 17683-17716 Sentence denotes   Normal# 27 (40.91%) 63 (90.00%)
T184 17717-17746 Sentence denotes   High# 39 (59.09%) 5 (7.14%)
T185 17747-17804 Sentence denotes Lymphocyte count (× 10^9/L) 1.57 ± 1.33 1.25 ± 0.68 0.086b
T186 17805-17840 Sentence denotes Lymphocyte count category < 0.001c*
T187 17841-17871 Sentence denotes   Low# 24 (36.36%) 32 (45.71%)
T188 17872-17905 Sentence denotes   Normal# 35 (53.03%) 37 (52.86%)
T189 17906-17934 Sentence denotes   High# 7 (10.61%) 1 (1.43%)
T190 17935-17995 Sentence denotes Neutrophil count (× 10^9/L) 8.97 ± 4.90 3.53 ± 2.17 < 0.001b*
T191 17996-18031 Sentence denotes Neutrophil count category < 0.001c*
T192 18032-18059 Sentence denotes   Low# 3 (4.55%) 8 (11.43%)
T193 18060-18093 Sentence denotes   Normal# 23 (34.85%) 59 (84.29%)
T194 18094-18123 Sentence denotes   High# 40 (60.61%) 3 (4.29%)
T195 18124-18187 Sentence denotes C-reactive protein (mg/L) 69.30 ± 65.88 26.37 ± 30.97 < 0.001b*
T196 18188-18241 Sentence denotes Procalcitonin (ng/mL) 3.36 ± 8.98 0.26 ± 0.84 0.007b*
T197 18242-18347 Sentence denotes *Data with statistical significance. pa: chi-square test, pb: Student’s t test. pc: Kruskal-Wallis H test
T200 18348-18413 Sentence denotes #Results are measurements with corresponding ratio in parentheses
T201 18414-18504 Sentence denotes For imaging manifestations, 7 patients in the COVID-19 group showed normal chest CT (10%).
T202 18505-18638 Sentence denotes COVID-19 patients have a greater number of pure GGO and mixed GGO than non-COVID-19 patients (p = 0.018 and p = 0.001, respectively).
T203 18639-18757 Sentence denotes For pure GGO lesions, the differences are significant both in peripheral (p = 0.032) and in central areas (p = 0.001).
T204 18758-18915 Sentence denotes However, mixed GGO lesions are mainly distributed at the periphery in COVID-19 patients (p < 0.001), with no statistical difference in the central area.
T205 18916-19001 Sentence denotes The consolidation lesions without GGO occurred less in COVID-19 patients (p = 0.001).
T206 19002-19143 Sentence denotes More lesions are between 1 and 3 cm (p = 0.027), and fewer lesions are larger than half of the lung segment (p = 0.017) in COVID-19 patients.
T207 19144-19502 Sentence denotes Other significant differences between the two groups include the pleural traction sign (p = 0.019), bronchial wall thickening (p < 0.001), interlobular septal thickening (p = 0.009), crazy paving (p < 0.001), tree-in-bud (p < 0.001), pleural effusions (p < 0.001), pleural thickening (p = 0.030), and the offending vessel augmentation in lesions (p < 0.001).
T208 19503-19598 Sentence denotes The lung score presents no significant difference between the COVID-19 and non-COVID-19 groups.
T209 19599-19715 Sentence denotes Comparison of clinical features between the two groups of patients with and without COVID-19 is reported in Table 2.
T210 19716-19789 Sentence denotes There is no significant difference in age and sex between the two groups.
T211 19790-19935 Sentence denotes Significant differences are found in common symptoms between groups, including fever (p = 0.003), dry cough (p = 0.025), and fatigue (p = 0.007).
T212 19936-20046 Sentence denotes The respiration rate and heart rate also show significant differences between the two groups (both p < 0.001).
T213 20047-20168 Sentence denotes Compared with non-COVID-19 pneumonia, the reduction of the WBC count is more pronounced in COVID-19 patients (p < 0.001).
T214 20169-20293 Sentence denotes The ratio of lymphocyte and ratio of neutrophil also show a significant difference between COVID-19 and non-COVID-19 groups.
T215 20294-20441 Sentence denotes Although lymphopenia was observed in 32 COVID-19 patients (45.71%), it is not statistically different compared with that in the non-COVID-19 group.
T216 20442-20593 Sentence denotes C-reactive protein (CRP) level and procalcitonin level are also significantly different between the two groups (p < 0.001 and p = 0.007, respectively).
T217 20594-20661 Sentence denotes Most COVID-19 patients present normal procalcitonin level (82.86%).
T218 20663-20706 Sentence denotes Clinical and radiological feature selection
T219 20707-20851 Sentence denotes Of the features, 18 radiological features and 17 clinical features were selected to form the predictors based on the result from Tables 1 and 2.
T220 20852-20921 Sentence denotes Table 3 lists the features selected by univariate analysis and LASSO.
T221 20922-20970 Sentence denotes Table 3 Selected features in C, R, and CR models
T222 20971-21013 Sentence denotes Model and individual features Coefficients
T223 21014-21028 Sentence denotes R, n = 8 (41)*
T224 21029-21048 Sentence denotes   Intercept − 0.307
T225 21049-21101 Sentence denotes   Total number of mixed GGO in peripheral area 0.359
T226 21102-21141 Sentence denotes   Total number of consolidation − 1.262
T227 21142-21207 Sentence denotes   Total number of solid nodules with ground-glass opacities 0.452
T228 21208-21248 Sentence denotes   Interlobular septal thickening − 5.559
T229 21249-21277 Sentence denotes   Crazy paving pattern 3.566
T230 21278-21299 Sentence denotes   Tree-in-bud − 2.548
T231 21300-21326 Sentence denotes   Pleural thickening 3.265
T232 21327-21375 Sentence denotes   Offending vessel augmentation in lesions 5.504
T233 21376-21390 Sentence denotes C, n = 7 (26)*
T234 21391-21409 Sentence denotes   Intercept 29.273
T235 21410-21431 Sentence denotes   Respiration − 0.359
T236 21432-21452 Sentence denotes   Heart rate − 0.054
T237 21453-21474 Sentence denotes   Temperature − 0.289
T238 21475-21507 Sentence denotes   White blood cell count − 0.175
T239 21508-21523 Sentence denotes   Cough − 1.866
T240 21524-21539 Sentence denotes   Fatigue 2.855
T241 21540-21575 Sentence denotes   Lymphocyte count category − 0.028
T242 21576-21592 Sentence denotes CR, n = 10 (67)*
T243 21593-21611 Sentence denotes   Intercept 45.117
T244 21612-21664 Sentence denotes   Total number of mixed GGO in peripheral area 0.108
T245 21665-21686 Sentence denotes   Tree-in-bud − 1.853
T246 21687-21735 Sentence denotes   Offending vessel augmentation in lesions 6.000
T247 21736-21757 Sentence denotes   Respiration − 0.583
T248 21758-21779 Sentence denotes   Heart rate − 0.084
T249 21780-21801 Sentence denotes   Temperature − 0.536
T250 21802-21834 Sentence denotes   White blood cell count − 0.471
T251 21835-21850 Sentence denotes   Cough − 0.997
T252 21851-21868 Sentence denotes   Fatigue − 0.228
T253 21869-21904 Sentence denotes   Lymphocyte count category − 2.177
T254 21905-22087 Sentence denotes C, R, and CR indicate the predicted model based on clinical features, radiological features, and the combination of clinical features and radiological features, respectively
T255 22088-22173 Sentence denotes *n means corresponding selected features, and data in parentheses are total features.
T256 22174-22286 Sentence denotes Coefficients: the estimate value of each feature in multivariate logistic regression model by “glm” package in R
T257 22288-22320 Sentence denotes Model development and validation
T258 22321-22522 Sentence denotes The prediction models based on (i) clinical features (C model), (ii) radiological features (R model), and (iii) the combination of clinical features and radiological features (CR model) were developed.
T259 22523-22606 Sentence denotes ROC analyses for the primary and validation cohort are shown in Table 4 and Fig. 3.
T260 22607-22799 Sentence denotes The CR model yielded the maximum AUC of 0.986 (95% CI 0.966~1.000) in the primary cohort, with the highest accuracy and specificity; its AUC was 0.936 (95% CI 0.866~1.000) in the validation cohort.
T261 22800-22938 Sentence denotes The AUC for the C model was 0.952 (95% CI 0.915~0.988) and 0.967 (95% CI 0.919~1.000) in the primary and validation cohorts, respectively.
T262 22939-23059 Sentence denotes For the R model, the AUC of the two cohorts was 0.969 (95% CI 0.940~0.997) and 0.809 (95% CI 0.669~0.948), respectively.
T263 23060-23119 Sentence denotes Table 4 Performance of the individualized prediction models
T264 23120-23170 Sentence denotes Primary cohort (n = 98) Validation cohort (n = 38)
T265 23171-23265 Sentence denotes Models AUC 95% CI Accuracy Specificity Sensitivity AUC 95% CI Accuracy Specificity Sensitivity
T266 23266-23345 Sentence denotes C model 0.952 0.915~0.988 0.888 0.894 0.882 0.967 0.919~1.000 0.868 0.859 0.842
T267 23346-23425 Sentence denotes R model 0.969 0.940~0.997 0.929 0.851 1.000 0.809 0.669~0.948 0.684 0.368 1.000
T268 23426-23506 Sentence denotes CR model 0.986 0.966~1.000 0.959 0.957 0.961 0.936 0.866~1.000 0.763 0.789 0.737
T269 23507-23690 Sentence denotes C, R, and CR indicate the prediction models based on clinical features, radiological features, and the combination of clinical and radiological features, respectively.
T270 23691-23713 Sentence denotes CI confidence interval
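For readers who want to relate the accuracy, specificity, and sensitivity columns to each other, the following Python sketch computes them from confusion-matrix counts. The counts are illustrative assumptions, not reported values: they were chosen because they reproduce the CR model's validation-cohort row (n = 38) after rounding to three decimals.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (accuracy, specificity, sensitivity) from confusion-matrix counts."""
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total
    specificity = tn / (tn + fp)   # true-negative rate
    sensitivity = tp / (tp + fn)   # true-positive rate
    return accuracy, specificity, sensitivity

# Assumed counts (not given in the table) that sum to 38 patients.
acc, spec, sens = diagnostic_metrics(tp=14, fn=5, tn=15, fp=4)
print(round(acc, 3), round(spec, 3), round(sens, 3))
```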
T271 23714-23785 Sentence denotes Fig. 3 ROC curves of the three models in the primary and validation cohorts.
T272 23786-24053 Sentence denotes Comparison of receiver operating characteristic (ROC) curves among the radiological model (R model), clinical model (C model), and the combination of clinical and radiological model (CR model) for the diagnosis of COVID-19 in the primary (a) and validation (b) cohorts
T273 24054-24259 Sentence denotes To determine the clinical usefulness of the diagnostic model, we developed the decision curve (Fig. 4), which showed better performance for the CR model than for the C model and the R model.
T274 24260-24443 Sentence denotes Across the majority of the range of reasonable threshold probabilities, the decision curve analysis showed that the CR model had a higher overall benefit than the C model and R model.
T275 24444-24513 Sentence denotes Fig. 4 Decision curve analysis for each model in the primary dataset.
T276 24514-24808 Sentence denotes The y-axis measures the net benefit, which is calculated by summing the benefits (true-positive findings) and subtracting the harms (false-positive findings), weighting the latter by a factor related to the relative harm of undetected disease compared with the harm of unnecessary treatment.
T277 24809-25067 Sentence denotes The decision curve shows that if the threshold probability is over 10%, the application of the combination of clinical and radiological model (CR model) to diagnose COVID-19 adds more benefit than the clinical model (C model) and radiological model (R model)
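The net benefit described in the caption is commonly written as TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability. The Python sketch below uses this standard decision-curve formulation; it is not necessarily the exact implementation used for Fig. 4, and the labels and predicted probabilities are made up for illustration.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit at one threshold: TP/n - FP/n * pt / (1 - pt)."""
    n = len(y_true)
    # A patient is called positive when the predicted probability reaches the threshold.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)

# Hypothetical labels (1 = COVID-19) and model outputs.
labels = [1, 1, 0, 0]
probabilities = [0.9, 0.4, 0.6, 0.1]

for pt in (0.2, 0.5):
    print(pt, net_benefit(labels, probabilities, pt))
```

Evaluating this function over a grid of thresholds for each model, and plotting the results, gives one curve per model.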
T278 25068-25415 Sentence denotes The nomogram (Fig. 5) was developed from the CR model in the primary cohort and incorporated the following factors: total number of mixed GGO in the peripheral area (TN_Mixed_GGO_IP), tree-in-bud, offending vessel augmentation in lesions (OVAIL), respiration, heart rate, temperature, white blood cell count, cough, fatigue, and lymphocyte count category.
T279 25416-25520 Sentence denotes The total points were calculated by summing the points identified on the “points” scale for each factor.
T280 25521-25655 Sentence denotes By comparing the “total points” scale and the “probability” scale, the individual probability of COVID-19 infection could be obtained.
T281 25656-25710 Sentence denotes Fig. 5 Nomogram of the CR model in the primary cohort.
T282 25711-25788 Sentence denotes TN_Mixed_GGO_IP represented the total number of mixed GGO in the peripheral area.
T283 25789-25848 Sentence denotes OVAIL represented offending vessel augmentation in lesions.
T284 25849-25902 Sentence denotes N was a negative result, and P was a positive result.
T285 25903-25927 Sentence denotes Norm represented normal.
T286 25928-25990 Sentence denotes Note that in probability scale, 0 = non-COVID-19, 1 = COVID-19
T287 25992-26002 Sentence denotes Discussion
T288 26003-26168 Sentence denotes In this multi-center study, statistical analysis was performed to compare imaging and clinical manifestations between pneumonia patients with and without COVID-19.
T289 26169-26321 Sentence denotes Eighteen radiological semantic features and seventeen clinical features were identified to be significantly different between the two groups (p < 0.05).
T290 26322-26403 Sentence denotes Three models for COVID-19 diagnosis were developed based on the refined features.
T291 26404-26510 Sentence denotes The models were validated in both the primary and validation cohorts and achieved an AUC as high as 0.986.
T292 26511-26702 Sentence denotes These models may play an essential role in early and easy-to-access diagnosis, especially when there are not enough RT-PCR kits or experimental platforms to test for COVID-19 infection.
T293 26703-26804 Sentence denotes A total of 1745 lesions were evaluated for qualitative features, location, and size in this study.
T294 26805-26998 Sentence denotes Consistent with the previous studies, the ground-glass opacities and consolidation in the lung periphery were considered to be the imaging hallmark in patients with COVID-19 infection [11, 25].
T295 26999-27142 Sentence denotes However, when we subdivided the GGO into pure GGO and mixed GGO, we found that the distribution pattern is different between these two lesions.
T296 27143-27305 Sentence denotes Pure GGO show differences between groups in every location of the lungs, whereas mixed GGO only have significant differences between groups in the lung periphery.
T297 27306-27378 Sentence denotes Recent studies defined four stages of lung involvement in COVID-19 [26].
T298 27379-27455 Sentence denotes Therefore, a follow-up analysis of these distributions would be significant.
T299 27456-27544 Sentence denotes The lesion size in patients with COVID-19 infection was another interesting observation.
T300 27545-27688 Sentence denotes Most lesions were between 1 and 3 cm, with few lesions larger than half of the lung segment, which was similar to the finding in MERS-CoV [22].
T301 27689-27873 Sentence denotes Other features similar to MERS-CoV and SARS-CoV were observed in the laboratory abnormalities, such as lymphopenia, which may be associated with the cellular immune deficiency [3, 27].
T302 27874-27990 Sentence denotes However, our results showed no significant difference in lymphopenia between the COVID-19 and non-COVID-19 patients.
T303 27991-28122 Sentence denotes To our knowledge, no diagnostic model based on imaging and clinical features alone has been proposed for the diagnosis of COVID-19.
T304 28123-28428 Sentence denotes Our clinical and radiological semantic (CR) model consisted of the following features: total number of GGO with consolidation in the peripheral area, tree-in-bud, offending vessel augmentation in lesions, temperature, heart rate, respiration, cough, fatigue, WBC count, and lymphocyte count category.
T305 28429-28500 Sentence denotes The CR model outperformed the individual clinical and radiological models.
T306 28501-28700 Sentence denotes This result was in accordance with that of a previous study in breast cancer, in which the model based on the combination of radiomics features and clinical features achieved a higher performance [24].
T307 28701-28894 Sentence denotes Compared with the radiomics-based model, the extraction of radiological semantic features can overcome the image discrepancy caused by different scanning parameters and/or different CT vendors.
T308 28895-29091 Sentence denotes A previous study [28] also indicated that models based on semantic features determined by an experienced thoracic radiologist slightly outperformed models based on computed texture features alone.
T309 29092-29134 Sentence denotes There are a few limitations in this study.
T310 29135-29293 Sentence denotes First, the sample size is relatively small because this is a retrospective analysis of a new disease and most of the cases outside of Wuhan City are imported.
T311 29294-29611 Sentence denotes Second, with the multi-center retrospective design, there is a potential bias of patient selection [29], since there may be some deviations in marking semantic features among readers, though we have taken the effort to reduce this by creating pictorial examples and setting feature criteria (Supplementary Materials).
T312 29612-29659 Sentence denotes Third, longitudinal CT study was not performed.
T313 29660-29799 Sentence denotes Whether or not this model can be used to evaluate the follow-ups and help to guide therapy remains an open question to be further explored.
T314 29800-29983 Sentence denotes Moreover, the rich high-order features of the CT image combined with radiomics or deep learning have not been studied, which may be another way to identify the patients with COVID-19.
T315 29984-30119 Sentence denotes Besides, one can also focus on the role of radiological features in disease monitoring, treatment evaluation, and prognosis prediction.
T316 30120-30231 Sentence denotes In conclusion, 1745 lesions and 67 features were compared between pneumonia patients with and without COVID-19.
T317 30232-30305 Sentence denotes Thirty-five features were significantly different between the two groups.
T318 30306-30476 Sentence denotes A diagnostic model with AUC as high as 0.986 was developed and validated both in the primary and in the validation cohorts, which may help improve the COVID-19 diagnosis.
T319 30478-30511 Sentence denotes Electronic supplementary material
T320 30513-30532 Sentence denotes ESM 1 (DOCX 404 kb)
T321 30534-30550 Sentence denotes Publisher’s note
T322 30551-30669 Sentence denotes Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
T323 30670-30733 Sentence denotes Xiaofeng Chen and Yanyan Tang contributed equally to this work.
T324 30735-30754 Sentence denotes Funding information
T325 30755-31116 Sentence denotes This study has received funding from the Natural Science Foundation of China (grant numbers 81471730, 31870981) to R.W.; the Natural Science Foundation of Guangdong Province (grant number 2018A030307057) to Z.D.; and the Special Project on Prevention and Control of COVID-19 for Colleges and Universities in Guangdong Province (grant number 2020KZDZX1085) to Z.D.
T326 31118-31151 Sentence denotes Compliance with ethical standards
T327 31153-31162 Sentence denotes Guarantor
T328 31163-31223 Sentence denotes The scientific guarantor of this publication is Zhuozhi Dai.
T329 31225-31245 Sentence denotes Conflict of interest
T330 31246-31330 Sentence denotes One of the authors of this manuscript (Yuting Liao) is an employee of GE Healthcare.
T331 31331-31476 Sentence denotes The remaining authors declare no relationships with any companies whose products or services may be related to the subject matter of the article.
T332 31478-31501 Sentence denotes Statistics and biometry
T333 31502-31577 Sentence denotes One of the authors, Dr. Yuting Liao, has significant statistical expertise.
T334 31579-31595 Sentence denotes Informed consent
T335 31596-31666 Sentence denotes Written informed consent was waived by the Institutional Review Board.
T336 31668-31684 Sentence denotes Ethical approval
T337 31685-31734 Sentence denotes Institutional Review Board approval was obtained.
T338 31736-31747 Sentence denotes Methodology
T339 31748-31763 Sentence denotes • retrospective
T340 31764-31784 Sentence denotes • case-control study
T341 31785-31805 Sentence denotes • multi-center study