Methods
Data sets splitting
We used multi-modal data sets from four public data sets and one hospital (Youan Hospital) in our research and split the hybrid data set in the following manner.

For X-data:
The CXR images of COVID-19 cases collected from the public CCD52 covered 212 patients diagnosed with COVID-19 and were resized to 512 × 512. Each image contained 1–2 suspected areas with inflammatory lesions (SAs). We also collected 5100 normal cases and 3100 pneumonia cases from another public data set (RSNA)53. In addition, the CXR images collected from the Youan Hospital comprised 45 cases diagnosed with COVID-19, 503 normal cases, 435 cases diagnosed with pneumonia (not COVID-19 patients), and 145 cases diagnosed with influenza. The CXR images collected from the Youan Hospital were obtained using the Carestream DRX-Revolution system.
All the CXR images of COVID-19 cases were analyzed by two experienced radiologists to determine the lesion areas. The X-data of the normal cases (XNPDS), the pneumonia cases (XPPDS), and the COVID-19 cases (XCPDS) from the public data sets constituted the X public data set (XPDS). The X-data of the normal cases (XNHDS), the pneumonia cases (XPHDS), and the COVID-19 cases (XCHDS) from the Youan Hospital constituted the X hospital data set (XHDS).
For CT-data:
We collected CT-data of 120 normal cases from a public lung CT-data set (LUNA16, a large data set for automatic nodule detection in the lungs54), which was a subset of LIDC-IDRI (LIDC-IDRI contains a total of 1018 helical thoracic CT scans acquired on scanners from eight medical imaging companies: AGFA Healthcare, Carestream Health, Inc., Fuji Photo Film Co., GE Healthcare, iCAD, Inc., Philips Healthcare, Riverain Medical, and Siemens Medical Solutions)55. The two experienced radiologists from the Youan Hospital confirmed that no lesion areas of COVID-19, pneumonia, or influenza were present in the 120 cases. We also collected the CT-data of pneumonia cases from a public data set (images of COVID-19 positive and negative pneumonia patients: ICNP)56. The CT-data collected from the Youan Hospital contained 95 patients diagnosed with COVID-19, 50 patients diagnosed with influenza, and 215 patients diagnosed with pneumonia. The CT scans collected from the Youan Hospital were obtained using the PHILIPS Brilliance iCT 256 system (which was also used for the LIDC-IDRI data set). The slice thickness of the CT scans was 5 mm, and the CT-data images were grayscale images with 512 × 512 pixels. For each case, 2–5 SAs were annotated in the images by the two experienced radiologists using a rapid keystroke-entry format; these areas ranged from 16 × 16 to 64 × 64 pixels. The CT-data of the normal cases (CTNPDS) and the pneumonia cases (CTPPDS) from the public data sets constituted the CT public data set (CTPDS). The CT-data of the COVID-19 cases (CTCHDS), the influenza cases (CTIHDS), and the normal cases (CTNHDS) from the Youan Hospital constituted the CT hospital (clinically-diagnosed) data set (CTHDS).
For clinical indicator data:
Five clinical indicators (white blood cell count, neutrophil percentage, lymphocyte percentage, procalcitonin, and C-reactive protein) of 95 COVID-19 cases were obtained from the Youan Hospital, as shown in Supplementary Table 20. A total of 95 data pairs from the 95 COVID-19 cases (369 images of the lesion areas and the 95 × 5 clinical indicators) were collected from the Youan Hospital for the correlation analysis between the lesion areas of COVID-19 and the five clinical indicators. The images of the SAs and the clinical indicator data constituted the correlation analysis data set (CADS).
We split the XPDS, XHDS, CTPDS, CTHDS, and CADS into training-validation (train-val) and test data sets using TTSF. The details of the hybrid data sets for the public data sets and the Youan Hospital data are shown in Table 1. The train-val part of CTHDS is referred to as CTHTS, and the test part is called CTHVS. The same naming scheme was adopted for XPDS, XHDS, CTPDS, and CADS, i.e., XPTS, XPVS, XHTS, XHVS, CTPTS, CTPVS, CATS, and CAVS, respectively. The train-val parts of the four public data sets and the hospital (Youan Hospital) data set were mixed for X-data and CT-data and named XMTS and CTMTS, respectively. The test parts were mixed in the same way and named XMVS and CTMVS.
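TTSF is referred to here only by its abbreviation, so the exact splitting procedure is not specified in this section. A minimal sketch of a stratified train-val/test split using scikit-learn's train_test_split (the 80/20 ratio and the helper itself are assumptions, not the paper's stated settings):

```python
# Hedged sketch: TTSF is unspecified here; this assumes a stratified
# 80/20 split via scikit-learn (both choices are assumptions).
from sklearn.model_selection import train_test_split

def split_data_set(samples, labels, test_ratio=0.2, seed=0):
    """Split one sub-data set (e.g., XPDS) into train-val and test parts."""
    return train_test_split(
        samples, labels,
        test_size=test_ratio,
        stratify=labels,      # keep class proportions in both parts
        random_state=seed,
    )

# Example: 10 normal (0) and 10 pneumonia (1) cases
x = list(range(20))
y = [0] * 10 + [1] * 10
x_trainval, x_test, y_trainval, y_test = split_data_set(x, y)
```

Stratification keeps the class ratio of each sub-data set intact in both parts, which matters when some classes (e.g., the 45 hospital COVID-19 CXR cases) are small.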
Image preprocessing
All image data (X-data and CT-data) in the DICOM format were loaded using the Pydicom library (version 1.4.0) and processed as arrays using the NumPy library (version 1.16.0).

X-data:
The two-dimensional array (x axis and y axis) of each X-data image (size of 512 × 512) was normalized to pixel values of 0–255 and stored in png format using the OpenCV library. Each preprocessed image was resized to 512 × 512 and had 3 channels.
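The normalization step can be sketched in pure NumPy as below; producing 3 channels by stacking the grayscale plane is one plausible reading of the text, and the png storage call is indicated in a comment rather than executed:

```python
import numpy as np

def preprocess_xray(pixel_array):
    """Min-max normalize a 512 x 512 DICOM pixel array to 0-255, 3 channels."""
    arr = pixel_array.astype(np.float32)
    arr = (arr - arr.min()) / max(float(arr.max() - arr.min()), 1e-8) * 255.0
    img = arr.astype(np.uint8)
    # Stored in png format via OpenCV, e.g. cv2.imwrite(path, img)
    return np.stack([img, img, img], axis=-1)  # 512 x 512 x 3
```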
CT-data:
The array of the CT-data was three-dimensional (x axis, y axis, and z axis), and the length of the z axis was ~300, which represented the number of image slices. Each image slice was two-dimensional (x axis and y axis, size of 512 × 512). As shown in Fig. 1b, the array of the image was divided into three groups in the z axis direction, and each group contained 100 image slices (each case was resampled to 300 image slices). The image slices in each group were processed using a window center of −600 and a window width of 2000 to extract the lung tissue. The images of the CT-data with 300 image slices were normalized to pixel values of 0–255 and stored in npy format using the NumPy library. A convolution filter with three 1 × 1 convolution kernels, implemented as a trainable layer that normalizes the input, was applied to preprocess the CT-data; the image size was 512 × 512, with 3 channels.
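The lung-window step (center −600, width 2000) maps CT numbers in [−1600, 400] HU to 0–255; a NumPy sketch:

```python
import numpy as np

def apply_lung_window(hu_slice, center=-600.0, width=2000.0):
    """Window a CT slice (Hounsfield units) and normalize to 0-255."""
    low, high = center - width / 2.0, center + width / 2.0   # -1600, 400 HU
    clipped = np.clip(hu_slice, low, high)
    scaled = (clipped - low) / (high - low) * 255.0
    return scaled.astype(np.uint8)

# The 300 windowed slices per case were stored in npy format,
# e.g. np.save(path, volume).
```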
Annotation tool for medical images
The server program of the annotation tool was deployed on a computer with large network bandwidth and abundant storage space. The client program of the annotation tool was deployed on the office computers of the experts, who were given unique user IDs for login. The interface of the client program had a built-in image viewer with a window size of 512 × 512 and an export tool for obtaining the annotations in text format. Multiple drawing tools were provided to annotate the lesion areas in the images, including a rectangle tool for drawing a bounding box around the target, a polygon tool for outlining the target, and a circle tool for circling the target. Multiple categories could be defined and assigned to the target areas. All annotations were stored in a structured query language (SQL) database, and the export tool was used to export the annotations to two common file formats (comma-separated values (csv) and JavaScript object notation (json)). The experts could share the annotation results.
Since the sizes of the X-data and the CT slice-data were identical, the annotations for both data types were performed with the annotation tool. Here, we use one image slice of the CT-data as an example to demonstrate the annotation process. In this study, two experts were asked to annotate the medical images. The normal cases were reviewed and confirmed by the experts. The abnormal cases, including the COVID-19 and influenza cases, were annotated by the experts. Bounding boxes of the lesion areas in the images were annotated using the annotation tool. In general, each case contained 2–5 slices with annotations. The cases with annotated slices were considered positive cases, and each case was assigned to a category (COVID-19 case or influenza case). The pipeline of the annotation is shown in Supplementary Fig. 1.
Model architecture and training
In this study, we proposed a modular CNNCF to identify the COVID-19 cases in the medical images and a CNNRF to determine the relationships between the lesion areas in the medical images and the five clinical indicators of COVID-19. Both proposed frameworks consisted of two shared units (ResBlock-A and ResBlock-B). In addition, the CNNCF and CNNRF each had a unique unit, namely the control gate block and the regressor block, respectively. Both frameworks were implemented using two NVIDIA GTX 1080TI graphics cards and the open-source PyTorch framework.

ResBlock-A:
As discussed in ref. 57, the residual block is a CNN-based block that allows CNN models to reuse features, thus accelerating the training of the models. In this study, we developed a residual block (ResBlock-A) that utilized a skip-connection to retain features across layers during forward propagation. This block (Fig. 6a) had a multiple-input multiple-output structure with two branches (an upper branch and a bottom branch), where input 1 and input 2 had the same size, but their values could differ. Similarly, output 1 and output 2 had the same size, but output 1 was not passed through a ReLu layer. The upper branch consisted of a max-pooling layer (Max-Pooling), a convolution layer (Conv 1 × 1), and a batch norm layer (BN). The Max-Pooling had a kernel size of 3 × 3 and a stride of 2 to downsample input 1, retaining the features and matching the size of the bottom branch output before the element-wise add operation. The Conv 1 × 1 consisted of multiple 1 × 1 convolution kernels, equal in number to those in the second convolution layer of the bottom branch, to adjust the number of channels. The BN used a regulation function to ensure the input to each layer of the model followed a normal distribution with a mean of 0 and a variance of 1. The bottom branch consisted of two convolution layers, two BN layers, and two ReLu layers. The first convolution layer in the bottom branch consisted of multiple 3 × 3 convolution kernels with a stride of 2 and a padding of 1 to reduce the size of the feature maps while extracting local features. The second convolution layer in the bottom branch consisted of multiple 3 × 3 convolution kernels with a stride of 1 and a padding of 1. The ReLu function was used as the activation function to ensure a non-linear relationship between the different layers. The output of the upper branch and the output of the bottom branch after the second BN were fused using an element-wise add operation. The fused result was output 1, and the fused result after the ReLu layer was output 2.
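A minimal PyTorch sketch of ResBlock-A; the channel counts and the padding of 1 on the Max-Pooling layer are assumptions needed to make the two branch outputs the same size, so the published implementation may differ in detail:

```python
import torch
import torch.nn as nn

class ResBlockA(nn.Module):
    """Sketch of ResBlock-A (Fig. 6a); Max-Pooling padding=1 is assumed
    so that both branches downsample identically."""

    def __init__(self, in_ch, out_ch):
        super().__init__()
        # Upper branch: Max-Pooling (3x3, stride 2) -> Conv 1x1 -> BN
        self.upper = nn.Sequential(
            nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
            nn.Conv2d(in_ch, out_ch, kernel_size=1),
            nn.BatchNorm2d(out_ch),
        )
        # Bottom branch: Conv 3x3 (stride 2) -> BN -> ReLu -> Conv 3x3 -> BN
        self.bottom = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=2, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(),
            nn.Conv2d(out_ch, out_ch, kernel_size=3, stride=1, padding=1),
            nn.BatchNorm2d(out_ch),
        )
        self.relu = nn.ReLU()  # not in-place: output 1 must stay pre-ReLu

    def forward(self, x1, x2):
        fused = self.upper(x1) + self.bottom(x2)  # element-wise add
        return fused, self.relu(fused)            # output 1, output 2
```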
Fig. 6 The four units of the proposed framework. a ResBlock-A architecture, containing two convolution layers with 3 × 3 kernels, one convolution layer with a 1 × 1 kernel, three batch normalization layers, two ReLu layers, and one max-pooling layer with a 3 × 3 kernel. b ResBlock-B architecture; the basic unit is the same as ResBlock-A, except for output 1. c The Control Gate Block has a synaptic-based frontend architecture that controls the direction of the feature map flow and the overall optimization direction of the framework. d The Regressor architecture is a skip-connection architecture containing one convolution layer with 3 × 3 kernels, one batch normalization layer, one ReLu layer, and three linear layers.
ResBlock-B:

The ResBlock-B (Fig. 6b) was a multiple-input single-output block that was similar to ResBlock-A, except that it had no output 1. The stride and padding values in each layer of ResBlock-A and ResBlock-B could be adjusted as hyper-parameters based on the requirements.
Control Gate Block:

As shown in Fig. 6c, the Control Gate Block was a multiple-input single-output block consisting of a predictor module, a counter module, and a synapses module to control the optimization direction while controlling the information flow in the framework. The pipeline of the predictor module is shown in Supplementary Fig. 19a, where the input S1 is the output of the ResBlock-B. The input S1 was flattened to a one-dimensional feature vector as the input of the linear layer. The output of the linear layer was converted to a probability for each category using the softmax function. A sensitivity calculator used Vpred and Vtrue as inputs to compute the true-positive (TP), true-negative (TN), false-positive (FP), and false-negative (FN) values and, from these, the sensitivity. The sensitivity calculation was followed by a step function to control the output of the predictor: if the calculated sensitivity was greater than or equal to the threshold ths, the step function output 1; otherwise, it output 0. The counter module was a conditional counter, as shown in Supplementary Fig. 19b. If the input n was zero, the counter was cleared and set to zero; otherwise, the counter increased by 1. The output of the counter was num. The synapses block mimicked the synaptic structure, and the input variable num was similar to a neurotransmitter, as shown in Supplementary Fig. 19c. The input num was the input parameter of a second step function: if num was greater than or equal to its threshold ths, the step function output 1; otherwise, it output 0. An element-wise multiplication was performed between the input S1 and the output of the synapses block, and the multiplied result was passed on to a discriminator. If the sum of the elements in the result was not zero, the input S1 was passed on to the next layer; otherwise, the input S1 information was not passed on.
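The predictor/counter/synapses logic can be sketched in plain NumPy; the threshold values below are illustrative placeholders, not the study's settings:

```python
import numpy as np

def step(x, ths):
    """Step function used by both the predictor and the synapses block."""
    return 1 if x >= ths else 0

def sensitivity(v_pred, v_true):
    """TP / (TP + FN) from binary predictions and labels."""
    tp = int(np.sum((v_pred == 1) & (v_true == 1)))
    fn = int(np.sum((v_pred == 0) & (v_true == 1)))
    return tp / max(tp + fn, 1)

def control_gate(s1, sens, num, ths_pred=0.8, ths_syn=3):
    """One pass through the Control Gate Block (thresholds are placeholders).

    Returns the (possibly blocked) feature map and the updated counter."""
    n = step(sens, ths_pred)            # predictor output
    num = num + 1 if n else 0           # conditional counter
    gated = s1 * step(num, ths_syn)     # synapses block, element-wise mult.
    # Discriminator: pass S1 on only when the gated sum is non-zero
    return (s1 if gated.sum() != 0 else None), num
```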
Regressor block:

The regressor block consisted of multiple linear layers, a convolution layer, a BN layer, and a ReLu layer, as shown in Fig. 6d. A skip-connection architecture was adopted to retain the features and increase the ability of the block to represent non-linear relationships. The convolution block in the skip-connection structure was a convolution layer with multiple 1 × 1 convolution kernels. The number of convolution kernels was the same as the output size of the second linear layer to ensure the consistency of the vector dimensions. The input and output sizes of each linear layer were adjustable to be applicable to actual cases.
Based on the four blocks, two frameworks were designed for the classification task and the regression task, respectively.

Classification framework:

The CNNCF consisted of stage I and stage II, as shown in Fig. 3a. Stage I was duplicated Q times in the framework (in this study, Q = 1). It consisted of M ResBlock-A (in this study, M = 2), one ResBlock-B, and one Control Gate Block. Stage II consisted of N ResBlock-A (in this study, N = 2) and one ResBlock-B. The weighted cross-entropy loss function was minimized using the SGD optimizer with a learning rate of a1 (in this study, a1 = 0.01). A warm-up strategy58 was used in the initialization of the learning rate for a smooth training start, and a reduction factor of b1 (in this study, b1 = 0.1) was used to reduce the learning rate after every c1 (in this study, c1 = 10) training epochs. The model was trained for d1 (in this study, d1 = 40) epochs, and the model parameters saved in the last epoch were used in the test phase.
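The learning-rate schedule (warm-up, then decay by b1 every c1 epochs) can be sketched as below; the 5-epoch linear warm-up length is an assumption, since the text only cites a warm-up strategy58:

```python
def learning_rate(epoch, a1=0.01, b1=0.1, c1=10, warmup_epochs=5):
    """Learning rate at a given epoch: linear warm-up, then step decay.

    a1, b1, c1 follow the text; the warm-up length is an assumption."""
    if epoch < warmup_epochs:
        return a1 * (epoch + 1) / warmup_epochs        # linear warm-up
    return a1 * b1 ** ((epoch - warmup_epochs) // c1)  # decay every c1 epochs
```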
Regression framework:

The CNNRF (Fig. 3b) consisted of two parts (stage II and the regressor). The inputs to the regression framework were the images of the lesion areas, and the output was the corresponding vector with five dimensions, representing the five clinical indicators (all clinical indicators were normalized to a range of 0–1). The stage II structure was the same as that in the classification framework, except for some parameters. The MSE loss function was minimized using the SGD optimizer with a learning rate of a2 (in this study, a2 = 0.01). A warm-up strategy was used in the initialization of the learning rate for a smooth training start, and a reduction factor of b2 (in this study, b2 = 0.1) was used to reduce the learning rate after every c2 (in this study, c2 = 50) training epochs. The framework was trained for d2 (in this study, d2 = 200) epochs, and the model parameters saved in the last epoch were used in the test phase.
The workflow of the classification framework:

The workflow of the classification framework is demonstrated in Fig. 3c. The preprocessed images are sent to the first convolution block to expand the channels and are processed as the input for the CNNCF. Given the input Fi with a size of M × N × 64, stage I outputs feature maps F′i with a size of M/8 × N/8 × 256 in the default configuration. As introduced above, the Control Gate Block controls the optimization direction while controlling the information flow in the framework. If the Control Gate Block is open, the feature maps F′i are passed on to stage II. Given the input F′i, stage II outputs the feature maps F″i with a size of M/64 × N/64 × 512, defined as follows:

(1) F′i = S1(Fi), F″i = S2(F′i) ⊗ CGB(F′i),

where S1 denotes the stage I block, S2 denotes the stage II block, CGB is the Control Gate Block, and ⊗ is the element-wise multiplication operation.
Stage II is followed by a global average pooling layer (GAP) and a fully connected layer (FC layer) with a softmax function to generate the final predictions. Given F″i as input, the GAP is adopted to generate a vector Vf with a size of 1 × 1 × 512. Given Vf as input, the FC layer with the softmax function outputs a vector Vc with a size of 1 × 1 × C:

(2) Vf = GAP(F″i), Vc = SMax(FC(Vf)),

where GAP is the global average pooling layer, FC is the fully connected layer, SMax is the softmax function, Vf is the feature vector generated by the GAP, Vc is the prediction vector, and C is the number of case types used in this study.
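The prediction head of Eq. (2) can be sketched directly in NumPy (random weights stand in for the trained FC layer):

```python
import numpy as np

def predict(feature_maps, W, b):
    """Eq. (2): Vf = GAP(F''), Vc = SMax(FC(Vf)).

    feature_maps: (H, W, 512) array; W: (C, 512) FC weights; b: (C,) bias."""
    v_f = feature_maps.mean(axis=(0, 1))  # GAP -> 512-dim vector Vf
    logits = W @ v_f + b                  # FC layer
    e = np.exp(logits - logits.max())     # numerically stable softmax
    return e / e.sum()                    # prediction vector Vc
```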
Training strategies and evaluation indicators of the classification framework
The training strategies and hyper-parameters of the classification framework were as follows. We adopted a knowledge distillation method (Fig. 7) to train the CNNCF as a student network with one stage I block and one stage II block, each of which contained two ResBlock-A. Four teacher networks (the hyper-parameters are provided in Supplementary Table 21) built with the proposed blocks were trained on the train-val part of each sub-data set using a 5-fold cross-validation method. All networks were initialized using the Xavier initialization method. The initial learning rate was 0.01, and the optimization function was SGD. The CNNCF was trained using the image data and the labels, as well as the fused output of the teacher networks. The comparison of the RT-PCR test results using throat specimens and the CNNCF results is provided in Supplementary Table 22. Supplementary Fig. 20 shows the details of the knowledge distillation method. The definitions and details of the five evaluation indicators used in this study are given in Supplementary Note 2.

Fig. 7 Knowledge distillation consisting of multiple teacher networks and a target student network. The knowledge is transferred from the teacher networks to the student network using a loss function.
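A minimal sketch of a distillation objective combining the hard label term with the softened fused teacher output; the temperature, weighting, and exact loss form are assumptions here, as Supplementary Fig. 20 holds the study's actual formulation:

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax."""
    z = np.asarray(z, dtype=float) / T
    e = np.exp(z - z.max())
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, label, T=2.0, alpha=0.5):
    """Hard label term + soft teacher term (T and alpha are placeholders)."""
    hard = -np.log(softmax(student_logits)[label] + 1e-12)
    q_t = softmax(teacher_logits, T)           # fused teacher output, softened
    q_s = softmax(student_logits, T)
    soft = -np.sum(q_t * np.log(q_s + 1e-12))  # cross-entropy on soft targets
    return alpha * hard + (1 - alpha) * soft
```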
Gradient-weighted class activation maps

Grad-CAM59 in the PyTorch framework was used to visualize the salient features that contributed the most to the prediction output of the model. Given a target category, Grad-CAM performed back-propagation to obtain the final CNN feature maps and the gradient of the feature maps; only pixels with positive contributions to the specified category were retained through the ReLU function. The Grad-CAM method was applied to all test data sets (X-data and CT-data) in the CNNCF without changing the framework structure to obtain a visual output of the framework's high discriminatory ability.
Statistics and reproducibility

We used multiple statistical indices and empirical distributions to assess the performance of the proposed frameworks. The equations of the statistical indices are shown in Supplementary Fig. 21, and all the abbreviations used in this study are defined in Supplementary Table 23. All the data used in this study followed these criteria: (1) participants signed informed consent prior to enrollment; (2) participants were at least 18 years old. This study was conducted following the Declaration of Helsinki and was approved by the Capital Medical University Ethics Committee. The following statistical analyses of the data were conducted to evaluate both the classification framework and the regression framework.

Statistical indices to evaluate the classification framework:
Multiple evaluation indicators (PRC, ROC, AUPRC, AUROC, sensitivity, specificity, precision, kappa index, and F1 with a fixed threshold) were computed for a comprehensive and accurate assessment of the classification framework. Multiple threshold values in the range from 0 to 1 with a step of 0.005 were used to obtain the ROC and PRC curves. The PRC showed the relationship between the precision and the sensitivity (or recall), and the ROC indicated the relationship between the sensitivity and specificity. The two curves reflected the comprehensive performance of the classification framework.
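The threshold sweep described above can be sketched as:

```python
import numpy as np

def roc_prc_points(y_true, scores):
    """Sweep 201 thresholds (0 to 1, step 0.005) to trace the ROC and PRC."""
    roc, prc = [], []
    for t in np.linspace(0.0, 1.0, 201):
        pred = (scores >= t).astype(int)
        tp = np.sum((pred == 1) & (y_true == 1))
        fp = np.sum((pred == 1) & (y_true == 0))
        fn = np.sum((pred == 0) & (y_true == 1))
        tn = np.sum((pred == 0) & (y_true == 0))
        sens = tp / max(tp + fn, 1)      # sensitivity / recall
        spec = tn / max(tn + fp, 1)      # specificity
        prec = tp / max(tp + fp, 1)      # precision
        roc.append((1.0 - spec, sens))   # ROC point
        prc.append((sens, prec))         # PRC point
    return roc, prc
```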
The kappa index is a statistical method for assessing the degree of agreement between different methods; in our use case, the indicator was used to measure the stability of the method. The F1 score is the harmonic mean of precision and sensitivity and considers both FP and FN. The bootstrapping method was used to calculate the empirical distribution of each indicator. The detailed calculation process was as follows: we conducted random sampling with replacement to generate 1000 new test data sets with the same number of samples as the original test data set, and the evaluation indicators were calculated on each to determine the distributions. The results are displayed in boxplots (Fig. 5 and Supplementary Fig. 2).
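The bootstrap procedure can be sketched as follows (shown with accuracy as an example indicator):

```python
import numpy as np

def bootstrap_distribution(y_true, y_pred, metric, n_boot=1000, seed=0):
    """Empirical distribution of an indicator via sampling with replacement."""
    rng = np.random.default_rng(seed)
    n = len(y_true)
    vals = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)   # resample, same size as test set
        vals.append(metric(y_true[idx], y_pred[idx]))
    return np.array(vals)

accuracy = lambda yt, yp: float(np.mean(yt == yp))
```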
Statistical indices to evaluate the regression framework:

Multiple evaluation indicators (MSE, RMSE, MAE, R2, and PCC) were computed for a comprehensive and accurate assessment of the regression framework. The MSE was used to calculate the deviation between the predicted and true values, and the RMSE was the square root of the MSE; these two indicators show the accuracy of the model predictions. The R2 was used to assess the goodness-of-fit of the regression framework, and the Pearson correlation coefficient (r) was used to assess the correlation between two variables in the regression framework. The indicators were calculated using the open-source scikit-learn and scipy libraries.
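Using the same libraries the study cites, the five indicators can be computed for one clinical indicator as:

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

def regression_indicators(y_true, y_pred):
    """MSE, RMSE, MAE, R2, and PCC for one clinical indicator (0-1 range)."""
    mse = mean_squared_error(y_true, y_pred)
    return {
        "MSE": mse,
        "RMSE": float(np.sqrt(mse)),
        "MAE": mean_absolute_error(y_true, y_pred),
        "R2": r2_score(y_true, y_pred),
        "PCC": pearsonr(y_true, y_pred)[0],
    }
```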