Deep learning (DL) methods, especially convolutional neural networks (CNNs), are effective approaches for representation learning using multilayer neural networks22 and have provided excellent performance solutions to many problems in image classification23,24, object detection25, games and decisions26, and natural language processing27. A deep residual network28 is a type of CNN architecture that uses the strategy of skip connections to avoid degradation of models. However, the applications of DL for clinical diagnoses remains limited due to the lack of interpretability of the DL model and the multi-modal properties of clinical data. Some studies have demonstrated excellent performance of DL methods for the detection of lung cancer with CT images29, pneumonia with CXR images30, and diabetic retinopathy with retinal fundus photographs31. To the best of our knowledge, the DL method has been validated only on single modal data, and no correlation analysis with clinical indicators was performed. Traditional machine learning methods are more constrained and better suited than DL methods to specific, practical computing tasks using features32. As demonstrated by Jin et al., the traditional machine learning algorithm using the scale-invariant feature transform (SIFT)33 and random sample consensus (RANSAC)34 may outperform the state-of-the-art DL methods for image matching35. We designed a general end-to-end DL framework for information extraction from CXR images (X-data) and CT images (CT-data) that can be considered a cross-domain transfer learning model.