All image data (X-data and CT-data) in the DICOM format were loaded using the Pydicom library (version 1.4.0) and processed as arrays using the NumPy library (version 1.16.0).

X-data: The two-dimensional array (x axis and y axis) of each X-data image (size of 512 × 512) was normalized to pixel values of 0–255 and stored in png format using the OpenCV library. Each preprocessed image was resized to 512 × 512 and had 3 channels.

CT-data: The array of the CT-data was three-dimensional (x axis, y axis, and z axis). The length of the z axis, which represented the number of image slices, was approximately 300, and each image slice was two-dimensional (x axis and y axis, size of 512 × 512). Each case was resampled to 300 image slices and, as shown in Fig. 1b, the array was divided into three groups along the z axis, with each group containing 100 image slices. The image slices in each group were processed using a window center of −600 and a window width of 2000 to extract the lung tissue. The CT-data images with 300 image slices were then normalized to pixel values of 0–255 and stored in npy format using the NumPy library. To preprocess the CT-data further, a convolution filter with three 1 × 1 convolution kernels was applied; this is a trainable layer whose aim is to normalize the input. The image size was 512 × 512, with 3 channels.
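The array-level steps described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several details are assumptions that the text does not specify: min-max scaling for the 0–255 normalization, channel replication to obtain 3 channels for the X-data, nearest-neighbour resampling along the z axis, a (z, y, x) axis order, and CT values already expressed in Hounsfield units. All function names are hypothetical, and a small synthetic volume stands in for data that would in practice be read with Pydicom.

```python
import numpy as np


def minmax_to_uint8(image):
    """Scale an array to 0-255 (assumption: min-max scaling)."""
    image = image.astype(np.float32)
    lo, hi = image.min(), image.max()
    if hi == lo:  # constant image: avoid division by zero
        return np.zeros(image.shape, dtype=np.uint8)
    return np.round((image - lo) / (hi - lo) * 255.0).astype(np.uint8)


def to_three_channels(gray):
    """Replicate a 2-D image into 3 channels (assumption: plain copy)."""
    return np.stack([gray, gray, gray], axis=-1)


def resample_slices(volume, n_slices=300):
    """Resample along z (axis 0) by nearest-neighbour index selection."""
    idx = np.round(np.linspace(0, volume.shape[0] - 1, n_slices)).astype(int)
    return volume[idx]


def apply_window(hu, center=-600.0, width=2000.0):
    """Clip to the window [center - width/2, center + width/2], scale to 0-255."""
    low = center - width / 2.0
    high = center + width / 2.0
    clipped = np.clip(hu.astype(np.float32), low, high)
    return np.round((clipped - low) / (high - low) * 255.0).astype(np.uint8)


def split_groups(volume, n_groups=3):
    """Divide the volume into equal groups along z (axis 0)."""
    return np.split(volume, n_groups, axis=0)


# Synthetic stand-in for a CT volume with 280 slices (small x/y for speed).
rng = np.random.default_rng(0)
demo = rng.integers(-2000, 1000, size=(280, 16, 16)).astype(np.int16)

# Windowing is element-wise, so applying it before or after grouping is equivalent.
groups = split_groups(apply_window(resample_slices(demo)))
print(len(groups), groups[0].shape, groups[0].dtype)  # prints: 3 (100, 16, 16) uint8

# Synthetic stand-in for one X-data image.
xray = to_three_channels(minmax_to_uint8(demo[0]))
```

With a window center of −600 and a width of 2000, values at or below −1600 map to 0 and values at or above 400 map to 255. The trainable 1 × 1 convolution layer is omitted from this sketch because the text does not name the deep learning framework or the layer's input channels.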