Experiment A

In this experiment, we used the X-data of the XPVS, where the normal cases were from the RSNA data set and the COVID-19 cases were from the COVID CXR data set (CCD). The results of the five evaluation indicators for the comparison of the COVID-19 cases and normal cases of the XPVS are shown in Table 2. The CNNCF performed excellently, with the highest specificity (99.33%) and precision (98.33%) of all the readers compared. Its F1 score was 96.72%, which was higher than that of the Respire. (96.12%), the Emerg. (93.94%), the Intern (84.67%), and the Rad-3rd (85.93%), and lower than that of the Rad-5th (98.41%). The kappa index was 95.40%, which was higher than that of the Respire. (94.43%), the Emerg. (91.21%), the Intern (77.45%), and the Rad-3rd (79.42%), and lower than that of the Rad-5th (97.74%). The sensitivity was 95.16%, which was higher than that of the Intern (93.55%) and the Rad-3rd (93.55%), and lower than that of the Respire. (100%), the Emerg. (100%), and the Rad-5th (100%). The receiver operating characteristic (ROC) scores for the CNNCF and the experts are plotted in Fig. 4a; the area under the ROC curve (AUROC) of the CNNCF is 0.9961. The precision-recall scores for the CNNCF and the experts are plotted in Fig. 4d; the area under the precision-recall curve (AUPRC) of the CNNCF is 0.9910.

Table 2 Performance indices of the classification framework (CNNCF) in experiment A and the average performance of the 7th-year respiratory resident (Respire.), the 3rd-year emergency resident (Emerg.), the 1st-year respiratory intern (Intern), the 5th-year radiologist (Rad-5th), and the 3rd-year radiologist (Rad-3rd).

Reader    F1 (95% CI)              Kappa (95% CI)           Specificity (95% CI)     Sensitivity (95% CI)     Precision (95% CI)
CNNCF     0.9672 (0.9307, 0.9890)  0.9540 (0.9030, 0.9924)  0.9933 (0.9792, 1.0000)  0.9516 (0.8889, 1.0000)  0.9833 (0.9444, 1.0000)
Respire.  0.9612 (0.9231, 0.9920)  0.9443 (0.8912, 0.9887)  0.9667 (0.9363, 0.9933)  1.0000 (1.0000, 1.0000)  0.9254 (0.8095, 0.9571)
Emerg.    0.9394 (0.8947, 0.9781)  0.9121 (0.8492, 0.9677)  0.9467 (0.9091, 0.9797)  1.0000 (1.0000, 1.0000)  0.8857 (0.8095, 0.9571)
Intern    0.8467 (0.7692, 0.9041)  0.7745 (0.6730, 0.8592)  0.8867 (0.8333, 0.9343)  0.9355 (0.8596, 0.984)   0.7733 (0.6708, 0.8649)
Rad-5th   0.9841 (0.9593, 1.0000)  0.9774 (0.9433, 1.0000)  0.9867 (0.9662, 1.0000)  1.0000 (1.0000, 1.0000)  0.9688 (0.9219, 1.0000)
Rad-3rd   0.8593 (0.7931, 0.9180)  0.7942 (0.7062, 0.8779)  0.9000 (0.8541, 0.9481)  0.9355 (0.8666, 0.9841)  0.7945 (0.6974, 0.8873)

Fig. 4 ROC and PRC curves for the CNNCF in experiments A-C. NC indicates that the positive case is COVID-19 and the negative case is normal. CI indicates that the positive case is COVID-19 and the negative case is influenza. The points are the results of the experts, corresponding to the results in Tables 2 and 3. The gray dashed curves in the background of the PRC plots are iso-F1 curves. a ROC curve for the NC using X-data. b ROC curve for the NC using CT-data. c ROC curve for the CI using CT-data. d PRC curve for the NC using X-data. e PRC curve for the NC using CT-data. f PRC curve for the CI using CT-data.
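The five indicators in Table 2 can all be derived from a single 2×2 confusion matrix, and the iso-F1 curves in the PRC panels of Fig. 4 follow from the definition of the F1 score. The following Python sketch illustrates both; it is not the authors' evaluation code. The function names are hypothetical, and the counts in the usage example (59 true positives, 1 false positive, 3 false negatives, 149 true negatives) are an assumption chosen because they are consistent with the CNNCF point estimates; the test-set composition is not stated in this section. The confidence intervals are not reproduced, since the method used to compute them is not described here.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute the five Table 2 indicators from 2x2 confusion-matrix counts.

    Positive = COVID-19, negative = normal. No guard against empty classes
    is included; this is an illustrative sketch only.
    """
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)          # recall on the positive class
    specificity = tn / (tn + fp)          # recall on the negative class
    precision = tp / (tp + fp)            # positive predictive value
    f1 = 2 * tp / (2 * tp + fp + fn)      # harmonic mean of precision and recall

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_observed = (tp + tn) / n
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
    kappa = (p_observed - p_chance) / (1 - p_chance)

    return {
        "f1": f1,
        "kappa": kappa,
        "specificity": specificity,
        "sensitivity": sensitivity,
        "precision": precision,
    }


def iso_f1_precision(f1, recall):
    """Precision on the iso-F1 curve for a given recall.

    Rearranges F1 = 2PR / (P + R) into P = F1 * R / (2R - F1),
    which is only defined for recall > F1 / 2.
    """
    if recall <= f1 / 2:
        raise ValueError("recall must exceed f1 / 2 on an iso-F1 curve")
    return f1 * recall / (2 * recall - f1)


if __name__ == "__main__":
    # ASSUMED counts, inferred from the reported point estimates only.
    metrics = binary_metrics(tp=59, fp=1, fn=3, tn=149)
    for name, value in metrics.items():
        print(f"{name}: {value:.4f}")
```

With these assumed counts, the printed values round to the CNNCF row of Table 2 (F1 0.9672, kappa 0.9540, specificity 0.9933, sensitivity 0.9516, precision 0.9833). Evaluating `iso_f1_precision` at the F1 score and sensitivity of a given reader returns that reader's precision, which is why each expert point in Fig. 4d lies on the iso-F1 curve of its own F1 score.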