PMC:7782580 / 9823-30388

LitCovid-PubTator

Id Subject Object Predicate Lexical cue tao:has_database_id
112 774-783 Disease denotes Pneumonia MESH:D011014
113 817-825 Disease denotes COVID-19 MESH:C000657245
114 934-943 Disease denotes Pneumonia MESH:D011014
115 979-987 Disease denotes COVID-19 MESH:C000657245
119 1143-1151 Disease denotes COVID-19 MESH:C000657245
120 1167-1176 Disease denotes pneumonia MESH:D011014
121 1218-1226 Disease denotes COVID-19 MESH:C000657245
123 1297-1305 Disease denotes COVID-19 MESH:C000657245
127 1397-1402 Species denotes human Tax:9606
128 1425-1430 Species denotes human Tax:9606
129 2718-2726 Disease denotes COVID-19 MESH:C000657245
131 3939-3947 Disease denotes COVID-19 MESH:C000657245
135 4804-4812 Disease denotes COVID-19 MESH:C000657245
136 4897-4905 Disease denotes COVID-19 MESH:C000657245
137 5182-5192 Disease denotes infections MESH:D007239
141 4078-4086 Disease denotes COVID-19 MESH:C000657245
142 4345-4353 Disease denotes COVID-19 MESH:C000657245
143 4440-4448 Disease denotes COVID-19 MESH:C000657245
146 7209-7217 Disease denotes COVID-19 MESH:C000657245
147 7405-7413 Disease denotes COVID-19 MESH:C000657245
150 6970-6974 Species denotes CATS Tax:9685
151 6886-6890 Disease denotes XPDS OMIM:300911
154 9798-9801 Gene denotes Rad Gene:6236
155 9670-9673 Gene denotes Rad Gene:6236
158 10031-10039 Disease denotes COVID-19 MESH:C000657245
159 10119-10127 Disease denotes COVID-19 MESH:C000657245
163 7539-7547 Disease denotes COVID-19 MESH:C000657245
164 7568-7573 Disease denotes COVID MESH:C000657245
165 7675-7683 Disease denotes COVID-19 MESH:C000657245
167 11664-11672 Disease denotes COVID-19 MESH:C000657245
169 12552-12560 Disease denotes COVID-19 MESH:C000657245
174 11155-11158 Gene denotes Rad Gene:6236
175 10698-10706 Disease denotes COVID-19 MESH:C000657245
176 10815-10823 Disease denotes COVID-19 MESH:C000657245
177 10942-10950 Disease denotes COVID-19 MESH:C000657245
181 13516-13524 Disease denotes COVID-19 MESH:C000657245
182 13637-13645 Disease denotes COVID-19 MESH:C000657245
183 13740-13748 Disease denotes COVID-19 MESH:C000657245
185 14787-14795 Disease denotes COVID-19 MESH:C000657245
188 14853-14861 Disease denotes COVID-19 MESH:C000657245
189 14941-14949 Disease denotes COVID-19 MESH:C000657245
195 16081-16090 Disease denotes pneumonia MESH:D011014
196 16293-16302 Disease denotes pneumonia MESH:D011014
197 16313-16321 Disease denotes COVID-19 MESH:C000657245
198 16697-16705 Disease denotes COVID-19 MESH:C000657245
199 16707-16716 Disease denotes pneumonia MESH:D011014
201 17355-17363 Disease denotes COVID-19 MESH:C000657245
205 17593-17601 Disease denotes COVID-19 MESH:C000657245
206 17951-17959 Disease denotes COVID-19 MESH:C000657245
207 18179-18187 Disease denotes COVID-19 MESH:C000657245
210 19453-19471 Gene denotes C-reactive protein Gene:1401
211 19476-19484 Disease denotes COVID-19 MESH:C000657245
213 20326-20330 Species denotes CATS Tax:9685

LitCovid-PD-HP

Id Subject Object Predicate Lexical cue hp_id
T10 774-783 Phenotype denotes Pneumonia http://purl.obolibrary.org/obo/HP_0002090
T11 934-943 Phenotype denotes Pneumonia http://purl.obolibrary.org/obo/HP_0002090
T12 1167-1176 Phenotype denotes pneumonia http://purl.obolibrary.org/obo/HP_0002090
T13 16081-16090 Phenotype denotes pneumonia http://purl.obolibrary.org/obo/HP_0002090
T14 16293-16302 Phenotype denotes pneumonia http://purl.obolibrary.org/obo/HP_0002090
T15 16707-16716 Phenotype denotes pneumonia http://purl.obolibrary.org/obo/HP_0002090

LitCovid-sentences

Id Subject Object Predicate Lexical cue
T62 0-7 Sentence denotes Results
T63 9-28 Sentence denotes Data set properties
T64 29-92 Sentence denotes Multi-modal data from multiple sources were used in this study.
T65 93-240 Sentence denotes X-data, CT-data, and clinical data used in our research were collected from four public data sets and one frontline hospital data set (Youan hospital).
T66 241-407 Sentence denotes Each data set was divided into a train-val part and a test part using the train-test-split function (TTSF) of the scikit-learn library; the resulting case counts are shown in Table 1.
T67 408-525 Sentence denotes The details of the multi-modal data types are described in the “Methods” section (see “Data sets splitting” section).
T68 526-643 Sentence denotes Table 1 Number of cases from four public data sets and the Youan hospital (X-data, CT-data, clinical indicator data).
T69 644-678 Sentence denotes Study X-data CT-data Clinical data
T70 679-729 Sentence denotes Train + Val Test Train + Val Test Train + Val Test
T71 730-773 Sentence denotes *Normal (RSNA + LUNA16) 5000 100 100 20 – –
T72 774-816 Sentence denotes Pneumonia (RSNA + ICNP) 3000 100 83 20 – –
T73 817-846 Sentence denotes COVID-19 (CCD) 150 62 – – – –
T74 847-890 Sentence denotes Influenza (Youan Hospital) 100 45 35 15 – –
T75 891-933 Sentence denotes *Normal (Youan Hospital) 478 25 139 20 – –
T76 934-978 Sentence denotes Pneumonia (Youan Hospital) 380 55 180 35 – –
T77 979-1022 Sentence denotes COVID-19 (Youan Hospital) 35 10 75 20 75 20
T78 1023-1051 Sentence denotes Total 9143 397 612 130 75 20
T79 1052-1239 Sentence denotes The term *Normal in this work refers to cases in which the lungs show no manifest evidence of COVID-19, influenza, or pneumonia on imaging and the RT-PCR test for COVID-19 is negative.
T80 1241-1341 Sentence denotes A platform was developed for annotating lesion areas of COVID-19 in medical images (X-data, CT-data)
T81 1342-1505 Sentence denotes Medical imaging uses images of internal tissues of the human body or a part of the human body in a non-invasive manner for clinical diagnoses or treatment plans [36].
T82 1506-1692 Sentence denotes Medical images (e.g., X-data and CT-data) are usually acquired using computed radiography and are typically stored in the Digital Imaging and Communications in Medicine (DICOM) format [37].
T83 1693-1872 Sentence denotes X-data are two-dimensional grayscale images, and CT-data are three-dimensional data consisting of two-dimensional grayscale slices stacked along the z axis.
T84 1873-1988 Sentence denotes Machine learning methods, especially DL methods, are playing increasingly important roles in medical image analysis.
T85 1989-2109 Sentence denotes DL uses multiple non-linear transformations to create a mapping relationship between the input data and output labels [38].
T86 2110-2204 Sentence denotes The objective of this study was to annotate lesion areas in medical images with high accuracy.
T87 2205-2393 Sentence denotes Therefore, we developed a pseudo-coloring method, a technique that enhances medical images so that physicians can isolate relevant tissues and group different tissues together [39].
T88 2394-2554 Sentence denotes We converted the original grayscale images to color images using the open-source image processing tools Open Source Computer Vision Library (OpenCV) and Pillow.
T89 2555-2612 Sentence denotes Examples of the pseudo-color images are shown in Fig. 1a.
T90 2613-2752 Sentence denotes We developed a platform that uses a client-server architecture to annotate the potential lesion areas of COVID-19 on the CXR and CT images.
T91 2753-2832 Sentence denotes The platform can be deployed on a private cloud for security and local sharing.
T92 2833-2991 Sentence denotes All the images were annotated by two experienced radiologists (one was a 5th-year radiologist and the other was a 3rd-year radiologist) in the Youan Hospital.
T93 2992-3154 Sentence denotes If there was disagreement about a result, a senior radiologist and a respiratory doctor made the final decision to ensure the precision of the annotation process.
T94 3155-3228 Sentence denotes The details of the annotation pipeline are shown in Supplementary Fig. 1.
T95 3229-3335 Sentence denotes Fig. 1 Demonstrations of data preprocessing methods including pseudo-coloring and dimension normalization.
T96 3336-3401 Sentence denotes a Pseudo-coloring for abnormal examples in the CXR and CT images.
T97 3402-3534 Sentence denotes The original grayscale images were transformed into color images using the pseudo-coloring method and were annotated by the experts.
T98 3535-3678 Sentence denotes The scale bar on the right is the range of pixel values of the image data. b Dimension normalization to reduce the dimensions in the CT images.
T99 3679-3782 Sentence denotes The number of CT images was first resampled to a multiple of three, and the images were then divided into three groups.
T100 3783-3861 Sentence denotes This was followed by 1 × 1 convolution layers to reduce the dimensions of the data.
T101 3863-3976 Sentence denotes PCA was used to determine the characteristics of the medical images for the COVID-19, influenza, and normal cases
T102 3977-4138 Sentence denotes PCA was used to visually compare the characteristics of the medical images (X-data, CT-data) for the COVID-19 cases with those of the normal and influenza cases.
T103 4139-4295 Sentence denotes Figure 2a shows the mean image of each category and the five eigenvectors that represent the principal components of PCA in the corresponding feature space.
T104 4296-4487 Sentence denotes Significant differences are observed between the COVID-19, influenza, and normal cases, indicating the possibility of being able to distinguish COVID-19 cases from normal and influenza cases.
T105 4488-4562 Sentence denotes Fig. 2 PCA visualizations and example heatmaps of both X-data and CT-data.
T106 4563-4625 Sentence denotes a Mean image and eigenvectors of five different sub-data sets.
T107 4626-4708 Sentence denotes The first column shows the mean image and the other columns show the eigenvectors.
T108 4709-4916 Sentence denotes The first row shows the mean image and five eigenvectors of the normal CXR images; second row: COVID-19 CXR images, third row: normal CT images, fourth row: influenza CT images, last row: COVID-19 CT images.
T109 4917-5103 Sentence denotes The scale bar on the right is the range of pixel values of the image data. b Heatmaps of both X-data and CT-data were demonstrated for better interpretability of the proposed frameworks.
T110 5104-5193 Sentence denotes The scale bar on the right is the probability of the areas being suspected as infections.
T111 5195-5364 Sentence denotes The CNN-based classification framework exhibited excellent performance based on the validation by experts using multi-modal data from public data sets and Youan hospital
T112 5365-5583 Sentence denotes The structure of the proposed framework, consisting of the stage I sub-framework and the stage II sub-framework, is shown in Fig. 3a, where Q, L, M, and N are the hyper-parameters of the framework for general use cases.
T113 5584-5713 Sentence denotes The values of Q, L, M, and N were 1, 1, 2, and 2, respectively, in this study; this configuration is referred to as the CNNCF framework.
T114 5714-5872 Sentence denotes The stage I and stage II sub-frameworks were designed to extract features corresponding to different optimization goals in the analysis of the medical images.
T115 5873-6118 Sentence denotes The performance of the CNNCF was evaluated using multi-modal data sets (X-data and CT-data) to ensure the generalization and transferability of the model, and five evaluation indicators were used (sensitivity, precision, specificity, F1, kappa).
T116 6119-6243 Sentence denotes The salient features of the images extracted by the CNNCF were visualized in a heatmap (four examples are shown in Fig. 2b).
T117 6244-6492 Sentence denotes In this study, multiple experiments were conducted (using data from the same source and from different sources) to validate the generalization ability of the framework while avoiding possible sample selection bias.
T118 6493-6729 Sentence denotes Five experts evaluated the images, i.e., a 7th-year respiratory resident (Respira.), a 3rd-year emergency resident (Emerg.), a 1st-year respiratory intern (Intern), a 5th-year radiologist (Rad-5th), and a 3rd-year radiologist (Rad-3rd).
T119 6730-6802 Sentence denotes The definition of the expert group can be found in Supplementary Note 1.
T120 6803-7085 Sentence denotes The abbreviations of all the data sets used in the following experiments including XPDS, XPTS, XPVS, XHDS, XHTS, XHVS, CTPDS, CTPTS, CTPVS, CTHDS, CTHTS, CTHVS, CADS, CATS, CAVS, XMTS, XMVS, CTMTS, and CTMVS were defined in the “Methods” section (see “Data sets splitting” section).
T121 7086-7122 Sentence denotes The following results were obtained.
T122 7123-7151 Sentence denotes Fig. 3 CNN-based frameworks.
T123 7152-7414 Sentence denotes a The classification framework for the identification of COVID-19. b The regression framework for the correlation analysis between the lesion areas and the clinical indicators. c The workflow of the classification framework for the identification of COVID-19.
T124 7416-7428 Sentence denotes Experiment-A
T125 7429-7602 Sentence denotes In this experiment, we used the X-data of the XPVS, where the normal cases were from the RSNA data set and the COVID-19 cases were from the COVID CXR data set (CCD).
T126 7603-7740 Sentence denotes The results of the five evaluation indicators for the comparison of the COVID-19 cases and normal cases of the XPVS are shown in Table 2.
T127 7741-7851 Sentence denotes Excellent performance was obtained, with a specificity of 99.33% and a precision of 98.33%.
T128 7852-8041 Sentence denotes The F1 score was 96.72%, which was higher than that of the Respira. (96.12%), the Emerg. (93.94%), the Intern (84.67%), and the Rad-3rd (85.93%) and lower than that of the Rad-5th (98.41%).
T129 8042-8235 Sentence denotes The kappa index was 95.40%, which was higher than that of the Respira. (94.43%), the Emerg. (91.21%), the Intern (77.45%), and the Rad-3rd (79.42%), and lower than that of the Rad-5th (97.74%).
T130 8236-8426 Sentence denotes The sensitivity index was 95.16%, which was higher than that of the Intern (93.55%) and the Rad-3rd (93.55%) and lower than that of the Respira. (100%), the Emerg. (100%), and the Rad-5th (100%).
T131 8427-8592 Sentence denotes The receiver operating characteristic (ROC) scores for the CNNCF and the experts are plotted in Fig. 4a; the area under the ROC curve (AUROC) of the CNNCF is 0.9961.
T132 8593-8748 Sentence denotes The precision-recall scores for the CNNCF and the experts are plotted in Fig. 4d; the area under the precision-recall curve (AUPRC) of the CNNCF is 0.9910.
T133 8749-9069 Sentence denotes Table 2 Performance indices of the classification framework (CNNCF) of experiment A and the average performance of the 7th-year respiratory resident (Respira.), the 3rd-year emergency resident (Emerg.), the 1st-year respiratory intern (Intern), the 5th-year radiologist (Rad-5th), and the 3rd-year radiologist (Rad-3rd).
T134 9070-9157 Sentence denotes F1 (95% CI) Kappa (95% CI) Specificity (95% CI) Sensitivity (95% CI) Precision (95% CI)
T135 9158-9285 Sentence denotes CNNCF 0.9672 (0.9307, 0.9890) 0.9540 (0.9030, 0.9924) 0.9933 (0.9792, 1.0000) 0.9516 (0.8889, 1.0000) 0.9833 (0.9444, 1.0000)
T138 9286-9414 Sentence denotes Respira. 0.9612 (0.9231, 0.9920) 0.9443 (0.8912, 0.9887) 0.9667 (0.9363, 0.9933) 1.0000 (1.0000, 1.0000) 0.9254 (0.8095, 0.9571)
T140 9415-9542 Sentence denotes Emerg. 0.9394 (0.8947, 0.9781) 0.9121 (0.8492, 0.9677) 0.9467 (0.9091, 0.9797) 1.0000 (1.0000, 1.0000) 0.8857 (0.8095, 0.9571)
T143 9543-9669 Sentence denotes Intern. 0.8467 (0.7692, 0.9041) 0.7745 (0.6730, 0.8592) 0.8867 (0.8333, 0.9343) 0.9355 (0.8596, 0.984) 0.7733 (0.6708, 0.8649)
T145 9670-9797 Sentence denotes Rad-5th 0.9841 (0.9593, 1.0000) 0.9774 (0.9433, 1.0000) 0.9867 (0.9662, 1.0000) 1.0000 (1.0000, 1.0000) 0.9688 (0.9219, 1.0000)
T146 9798-9925 Sentence denotes Rad-3rd 0.8593 (0.7931, 0.9180) 0.7942 (0.7062, 0.8779) 0.9000 (0.8541, 0.9481) 0.9355 (0.8666, 0.9841) 0.7945 (0.6974, 0.8873)
T147 9926-9989 Sentence denotes Fig. 4 ROC and PRC curves for the CNNCF of the experiments A-C.
T148 9990-10079 Sentence denotes NC indicates that the positive case is a COVID-19 case, and the negative case is *Normal.
T149 10080-10164 Sentence denotes CI indicates that the positive case is COVID-19, and the negative case is influenza.
T150 10165-10251 Sentence denotes The points are the results of experts, corresponding to the results in Tables 2 and 3.
T151 10252-10561 Sentence denotes The background gray dashed curves in the PRC curve correspond to the iso-F1 curves. a ROC curve for the NC using X-data. b ROC curve for the NC using CT-data. c ROC curve for the CI using CT-data. d PRC curve for the NC using X-data. e PRC curve for the NC using CT-data. f PRC curve for the CI using CT-data.
T152 10563-10575 Sentence denotes Experiment-B
T153 10576-10742 Sentence denotes In this experiment, we used the CT-data of the CTPVS and CTHVS, where the normal cases were from the LUNA16 data set and the COVID-19 cases were from the Youan hospital.
T154 10743-10976 Sentence denotes The results of the five evaluation indicators for the comparison of the COVID-19 cases and normal cases of the CTHVS and the CTPVS are shown in Table 3, where the normal cases are from CTPVS and the COVID-19 cases are from the CTHVS.
T155 10977-11163 Sentence denotes The CNNCF exhibits good performance on the five evaluation indices, which are similar to those of the Respira. and the Rad-5th and higher than those of the Intern, the Emerg., and the Rad-3rd.
T156 11164-11233 Sentence denotes The ROC scores are plotted in Fig. 4b; the AUROC of the CNNCF is 1.0.
T157 11234-11314 Sentence denotes The precision-recall scores are shown in Fig. 4e; the AUPRC of the CNNCF is 1.0.
T158 11315-11647 Sentence denotes Table 3 Performance indices of the classification framework (CNNCF) of the experiments B and C, and the average performance of the 7th-year respiratory resident (Respira.), the 3rd-year emergency resident (Emerg.), the 1st-year respiratory intern (Intern), the 5th-year radiologist (Rad-5th), and the 3rd-year radiologist (Rad-3rd).
T159 11648-11679 Sentence denotes CT (*Normal and COVID-19 cases)
T160 11680-11725 Sentence denotes CNNCF Respira. Emerg. Intern. Rad-5th Rad-3rd
T164 11726-11881 Sentence denotes F1 (95% CI) 1.0000 (1.0000, 1.0000) 1.0000 (1.0000, 1.0000) 1.0000 (1.0000, 1.0000) 0.9500 (0.8571, 1.0000) 1.0000 (1.0000, 1.0000) 0.9500 (0.8667, 1.0000)
T165 11882-12040 Sentence denotes Kappa (95% CI) 1.0000 (1.0000, 1.0000) 1.0000 (1.0000, 1.0000) 1.0000 (1.0000, 1.0000) 0.9500 (0.7422, 1.0000) 1.0000 (1.0000, 1.0000) 0.9000 (0.7487, 1.0000)
T166 12041-12205 Sentence denotes Specificity (95% CI) 1.0000 (1.0000, 1.0000) 1.0000 (1.0000, 1.0000) 1.0000 (1.0000, 1.0000) 0.9500 (0.8333, 1.0000) 1.0000 (1.0000, 1.0000) 0.9500 (0.8333, 1.0000)
T167 12206-12370 Sentence denotes Sensitivity (95% CI) 1.0000 (1.0000, 1.0000) 1.0000 (1.0000, 1.0000) 1.0000 (1.0000, 1.0000) 0.9500 (0.8333, 1.0000) 1.0000 (1.0000, 1.0000) 0.9500 (0.8421, 1.0000)
T168 12371-12533 Sentence denotes Precision (95% CI) 1.0000 (1.0000, 1.0000) 1.0000 (1.0000, 1.0000) 1.0000 (1.0000, 1.0000) 0.9500 (0.8235, 1.0000) 1.0000 (1.0000, 1.0000) 0.9500 (0.8333, 1.0000)
T169 12534-12567 Sentence denotes CT (Influenza and COVID-19 cases)
T170 12568-12613 Sentence denotes CNNCF Respira. Emerg. Intern. Rad-5th Rad-3rd
T174 12614-12768 Sentence denotes F1 (95%CI) 1.0000 (1.0000, 1.0000) 1.0000 (1.0000, 1.0000) 0.8966 (0.7332, 1.0000) 0.8000 (0.6207, 0.9412) 0.9677 (0.8889, 1.0000) 0.8667 (0.7199, 0.9744)
T175 12769-12927 Sentence denotes Kappa (95%CI) 1.0000 (1.0000, 1.0000) 1.0000 (1.0000, 1.0000) 0.8236 (0.5817, 1.0000) 0.6500 (0.3698, 0.8852) 0.9421 (0.8148, 1.0000) 0.7667 (0.5349, 0.9429)
T177 12928-13091 Sentence denotes Specificity (95%CI) 1.0000 (1.0000, 1.0000) 1.0000 (1.0000, 1.0000) 0.9048 (0.7619, 1.0000) 0.8500 (0.6818, 1.0000) 0.9500 (0.8333, 1.0000) 0.9000 (0.7619, 1.0000)
T178 13092-13255 Sentence denotes Sensitivity (95%CI) 1.0000 (1.0000, 1.0000) 1.0000 (1.0000, 1.0000) 0.9286 (0.7500, 1.0000) 0.8000 (0.5714, 1.0000) 1.0000 (1.0000, 1.0000) 0.8667 (0.6667, 1.0000)
T179 13256-13417 Sentence denotes Precision (95%CI) 1.0000 (1.0000, 1.0000) 1.0000 (1.0000, 1.0000) 0.8667 (0.6874, 1.0000) 0.8000 (0.5881, 1.0000) 0.9375 (0.8000, 1.0000) 0.8667 (0.6667, 1.0000)
T180 13419-13431 Sentence denotes Experiment-C
T181 13432-13564 Sentence denotes In this experiment, we used the CT-data of the CTHVS, where the influenza cases and the COVID-19 cases were all from the Youan hospital.
T182 13565-13778 Sentence denotes The results of the five evaluation indicators for the comparison of the COVID-19 cases and influenza cases of the CTHVS are shown in Table 3 where the influenza cases and the COVID-19 cases are all from the CTHVS.
T183 13779-13872 Sentence denotes The CNNCF achieved the best score on all five evaluation indices.
T184 13873-13942 Sentence denotes The ROC scores are plotted in Fig. 4c; the AUROC of the CNNCF is 1.0.
T185 13943-14027 Sentence denotes The precision-recall scores are shown in Fig. 4f, and the AUPRC of the CNNCF is 1.0.
T186 14029-14041 Sentence denotes Experiment-D
T187 14042-14303 Sentence denotes Boxplots of three of the five evaluation indicators for experiments A–C are shown in Fig. 5: the F1 score (Fig. 5a, d, g), the kappa coefficient (Fig. 5b, e, h), and the specificity (Fig. 5c, f, i); the precision and sensitivity are shown in Supplementary Fig. 2.
T188 14304-14471 Sentence denotes A bootstrapping method [40] was used to calculate the empirical distributions, and McNemar’s test [41] was used to analyze the differences between the CNNCF and the experts.
T189 14472-14687 Sentence denotes The p-values of the McNemar’s test (Supplementary Tables 1–3) for the five evaluation indicators were all 1.0, indicating no statistically significant difference between the CNNCF results and the expert evaluations.
T190 14688-14811 Sentence denotes Fig. 5 Boxplots of the F1 score, kappa score, and specificity for the CNNCF and expert results for COVID-19 identification.
T191 14812-14901 Sentence denotes NC indicates that the positive case is a COVID-19 case, and the negative case is *Normal.
T192 14902-14986 Sentence denotes CI indicates that the positive case is COVID-19, and the negative case is influenza.
T193 14987-15441 Sentence denotes Bootstrapping is used to generate n = 1000 resampled independent validation sets for the XVS and the CTVS. a F1 score for the NC using X-data. b Kappa score for the NC using X-data. c Specificity for the NC using X-data. d F1 score for the NC using CT-data. e Kappa score for the NC using CT-data. f Specificity for the NC using CT-data. g F1 score for the CI using CT-data. h Kappa score for the CI using CT-data. i Specificity for the CI using CT-data.
T194 15442-15703 Sentence denotes We also conducted extra experiments with both configurations of the same data source and different data sources: the descriptions and graph charts can be found in the Supplementary Experiments and Tables (Supplementary Tables 4–19 and Supplementary Figs. 3–18).
T195 15704-15794 Sentence denotes The data used in experiments E–G were CTHVS and the data were all from the Youan hospital.
T196 15795-15884 Sentence denotes The data used in experiments H–K were XHVS and the data were all from the Youan hospital.
T197 15885-15938 Sentence denotes The data used in experiments L–N were XPVS and CTPVS.
T198 15939-16155 Sentence denotes The data used in experiment L were from a single data set (RSNA), whereas the data used in experiment M were from different data sets: the pneumonia cases were from the ICNP, and the normal cases were from LUNA16.
T199 16156-16350 Sentence denotes The data used in the experiments O–R, from the four public data sets and one hospital (Youan hospital) data set (including normal cases, pneumonia cases and COVID-19 cases), were XMVS and CTMVS.
T200 16351-16429 Sentence denotes In all the experiments (experiments A–R), the CNNCF achieved good performance.
T201 16430-16590 Sentence denotes Notably, in order to obtain a more comprehensive evaluation of the CNNCF while further improving the usability in clinical practice, experiment-R was performed.
T202 16591-16763 Sentence denotes In experiment-R, the CNNCF was used to distinguish three types of cases simultaneously (COVID-19, pneumonia, and normal cases) on both the XMVS and CTMVS.
T203 16764-16961 Sentence denotes Good performance was obtained on the XMVS, with an F1 score of 91.89%, a kappa score of 89.74%, a specificity of 97.14%, a sensitivity of 94.44%, and a precision of 89.47%.
T204 16962-17084 Sentence denotes Excellent performance was obtained on the CTMVS, where all five evaluation indicators reached 100.00%.
T205 17085-17198 Sentence denotes The ROC and PRC scores in experiment-R were also satisfactory; they are shown in Supplementary Fig. 18.
T206 17199-17307 Sentence denotes The results of experiment-R further demonstrated the effectiveness and robustness of the proposed CNNCF.
T207 17309-17363 Sentence denotes Image analysis identifies salient features of COVID-19
T208 17364-17503 Sentence denotes In clinical practice, the diagnostic decision of a clinician relies on the identification of the SAs in the medical images by radiologists.
T209 17504-17636 Sentence denotes The statistical results show that the performance of the CNNCF for the identification of COVID-19 is as good as that of the experts.
T210 17637-17740 Sentence denotes A comparison consisting of two parts was performed to evaluate the discriminatory ability of the CNNCF.
T211 17741-17901 Sentence denotes In the first part, we used Grad-CAM, which is a non-intrusive method to extract the salient features in medical images, to create a heatmap of the CNNCF result.
T212 17902-17992 Sentence denotes Figure 2b shows the heatmaps of four examples of COVID-19 cases in the X-data and CT-data.
T213 17993-18188 Sentence denotes In the second part, we used density-based spatial clustering of applications with noise (DBSCAN) to calculate the center pixel coordinates (CPC) of the salient features corresponding to COVID-19.
T214 18189-18235 Sentence denotes All CPCs were normalized to a range of 0 to 1.
T215 18236-18386 Sentence denotes Subsequently, we used a significance test (ST) [42] to analyze the relationship between the CPC of the CNNCF output and the CPC annotated by the experts.
T216 18387-18636 Sentence denotes A good performance was obtained, with a mean square error (MSE) of 0.0108, a mean absolute error (MAE) of 0.0722, a root mean squared error (RMSE) of 0.1040, a correlation coefficient (r) of 0.9761, and a coefficient of determination (R²) of 0.8801.
T217 18638-18759 Sentence denotes A strong correlation was observed between the lesion areas detected by the proposed framework and the clinical indicators
T218 18760-18911 Sentence denotes In clinical practice, multiple clinical indicators are analyzed to determine whether further examinations (i.e., medical image examination) are needed.
T219 18912-18987 Sentence denotes These indicators can be used to assess the predictive ability of the model.
T220 18988-19089 Sentence denotes In addition, various examinations are required to perform an accurate diagnosis in clinical practice.
T221 19090-19180 Sentence denotes However, the correlations between the results of various examinations are often not clear.
T222 19181-19500 Sentence denotes We used the stage II sub-framework and the regressor block of the CNNRF to conduct a correlation analysis between the lesion areas detected by the framework and five clinical indicators (white blood cell count, neutrophil percentage, lymphocyte percentage, procalcitonin, C-reactive protein) of COVID-19 using the CADS.
T223 19501-19694 Sentence denotes The inputs of the CNNRF were the lesion area images of each case, and the output was a 5-dimensional vector describing the correlation between the lesion areas and the five clinical indicators.
T224 19695-19759 Sentence denotes The MAE, MSE, RMSE, r, and R² were used to evaluate the results.
T225 19760-19907 Sentence denotes The ST and the Pearson correlation coefficient (PCC) [43] were used to determine the correlation between the lesion areas and the clinical indicators.
T226 19908-20019 Sentence denotes A strong correlation was obtained, with MSE = 0.0163, MAE = 0.0941, RMSE = 0.1172, r = 0.8274, and R² = 0.6465.
T227 20020-20113 Sentence denotes At a significance level of 0.001, the value of r was 1.27 times the critical value of 0.6524.
T228 20114-20224 Sentence denotes This result indicates a high and significant correlation between the lesion areas and the clinical indicators.
T229 20225-20296 Sentence denotes The PCC was 0.8274 (range of 0.8–1.0), indicating a strong correlation.
T230 20297-20360 Sentence denotes The CNNRF was trained on the CATS and evaluated using the CAVS.
T231 20361-20478 Sentence denotes The initial learning rate was 0.01, and the optimization function was the stochastic gradient descent (SGD) method [44].
T232 20479-20565 Sentence denotes The parameters of the CNNRF were initialized using the Xavier initialization method [45].
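
The five evaluation indicators used throughout the experiments above (sensitivity, specificity, precision, F1, and kappa) can all be derived from a 2 × 2 confusion matrix. The following is a minimal sketch of that arithmetic, not the authors' evaluation code. The function name `binary_metrics` is hypothetical, and the counts passed to it (59 true positives, 3 false negatives, 1 false positive, 149 true negatives) are an assumption: the source does not state them, and they were chosen only because they are consistent with the CNNCF point estimates reported for Experiment-A.

```python
def binary_metrics(tp, fn, fp, tn):
    """Compute five indicators from the cells of a 2x2 confusion matrix."""
    n = tp + fn + fp + tn
    sensitivity = tp / (tp + fn)        # recall on the positive class
    specificity = tn / (tn + fp)        # recall on the negative class
    precision = tp / (tp + fp)          # positive predictive value
    f1 = 2 * tp / (2 * tp + fp + fn)    # harmonic mean of precision and sensitivity
    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_observed = (tp + tn) / n
    p_chance = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n**2
    kappa = (p_observed - p_chance) / (1 - p_chance)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "kappa": kappa,
    }


# Hypothetical counts (an assumption, not given in the source).
metrics = binary_metrics(tp=59, fn=3, fp=1, tn=149)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these assumed counts, the printed values round to 0.9516, 0.9933, 0.9833, 0.9672, and 0.9540, matching the Experiment-A point estimates for the CNNCF. The confidence intervals in the tables would additionally require the bootstrapped resamples, which are not reproduced here.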