3.3. Class Prediction Analysis

Class prediction analysis was used to construct multivariate predictors that classify arrays into pre-defined classes based on the autoantibody profiles of these classes. The emphasis is on developing an accurate multivariate classifier that is capable of predicting the class to which a future sample belongs. Table 5 provides an overview of class prediction results obtained using either the whole data set or only samples of the same histology and their matched controls. Although DWD showed the lowest performance in class comparison analysis, this pre-processing method was included in class prediction alongside quantile normalization and ComBat so that the class prediction performance of all methods could be evaluated. Class prediction analysis of DWD-adjusted data resulted in lower accuracy for the distinct histological entities of lung cancer (79% to 92%), except for AdCa (91%), compared to quantile normalized and ComBat-adjusted data. In the following, class prediction results of quantile normalized and ComBat-adjusted data are outlined.

Different feature selection algorithms and best predictor models were used to compare the performance of different settings. It can be seen that overall sensitivity and specificity values are higher than 80%. Correct classification rates, representing the proportion of samples correctly classified as case or control by the calculated classifier panel, are also comparable, ranging from 79% to 85% when applying class prediction to all lung cancer cases versus all controls. Using distinct histological subtypes versus matched controls, correct classification rates from 83% to 98% could be achieved. In general, class prediction performed with data from only one histological subtype yielded higher accuracy (0.83–0.98), sensitivity (0.80–1.00) and specificity (0.83–0.96) values than analyzing all lung cancer samples versus all controls (accuracy 0.79–0.85, sensitivity 0.76–0.90, and specificity 0.75–0.85).
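For reference, the three performance measures reported in this section can be derived from the counts of correctly and incorrectly classified samples. The following Python sketch is illustrative only: the class `ConfusionCounts`, the function names, and the example counts are hypothetical and are taken neither from Table 5 nor from the software used in this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a two-class (case vs. control) prediction result."""
    tp: int  # cases correctly predicted as cases
    fn: int  # cases wrongly predicted as controls
    tn: int  # controls correctly predicted as controls
    fp: int  # controls wrongly predicted as cases


def sensitivity(c: ConfusionCounts) -> float:
    # Fraction of true cases that the classifier labels as cases.
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    # Fraction of true controls that the classifier labels as controls.
    return c.tn / (c.tn + c.fp)


def correct_classification_rate(c: ConfusionCounts) -> float:
    # Fraction of all samples (cases and controls) labeled correctly.
    return (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)


# Hypothetical counts for 20 cases and 20 matched controls (not study data).
example = ConfusionCounts(tp=18, fn=2, tn=17, fp=3)
example_metrics = {
    "sensitivity": sensitivity(example),
    "specificity": specificity(example),
    "correct_classification_rate": correct_classification_rate(example),
}
```

With equal numbers of cases and controls, as in this hypothetical example, the correct classification rate is the mean of sensitivity and specificity; with unbalanced groups it is weighted toward the larger group.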
Furthermore, for SCLC and AdCa, class prediction on quantile normalized data yielded a higher accuracy than on ComBat-adjusted data, whereas for LCLC and SqLC ComBat was favorable. The Venn diagrams in Figure 6 show the antigen overlaps between the four histological subtype-specific classifier lists (calculated with class prediction analysis using 100 RFE). Only a minor proportion of antigens (zero to six) overlaps. In general, there are no antigens common to all four histological subtypes. There is almost no difference in the number of overlapping antigenic proteins between quantile normalized and ComBat-adjusted data.

Table 5. Performance of classifier panels calculated with class prediction analysis. Different normalization strategies were used and different subsets of samples were analyzed. The analysis of all samples together means that all four histological types of lung cancer were merged into one cancer class. a Greedy-pairs algorithm (GP), recursive feature elimination (RFE); b Nearest Neighbor classification (NN), support vector machine (SVM), Compound Covariate Predictor (CCP), Nearest Centroid (NC), Diagonal Linear Discriminant Analysis (DLDA); c correct classification rate.

Figure 6. Venn diagram showing classifier overlaps from class prediction results with respect to histological subtypes. Each circle represents a classifier (100 antigens), specific for one histological subtype, found by means of class prediction analysis using recursive feature elimination (see Table 5) as the feature selection method. (a) Quantile-normalized data, (b) ComBat-adjusted data.

Moreover, the intersection of the histological subtype-specific classifiers (calculated with class prediction analysis using 100 RFE) between ComBat-adjusted and quantile normalized data is approximately 50%.
The 100-antigen classifiers obtained with quantile normalization versus ComBat for SCLC, SqLC, LCLC, and AdCa overlapped by 56, 45, 46, and 52 antigens, respectively. The intersection of classifying antigens for DWD versus quantile normalization comprises 13, 15, 18, and 16 antigens, and for DWD versus ComBat adjustment 12, 12, 15, and 12 antigens, for SCLC, SqLC, LCLC, and AdCa, respectively. The classifiers are published in and covered by the European Patent Application “Lung cancer diagnostic method and means”, Publication Number EP2806274A1 [43]. Two representative classifiers distinguishing between all cancer cases and cancer-free controls are given in the supplementary section (Supplementary Tables S1 and S2).
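Overlap counts of this kind amount to set intersections between classifier lists. The following Python sketch shows one way such counts could be obtained. It is a minimal illustration under stated assumptions: the function names and the antigen identifiers (`Ag01` and so on) are hypothetical placeholders, the toy lists are far shorter than the 100-antigen classifiers described here, and the code is not the procedure used to generate Figure 6.

```python
from itertools import combinations
from typing import Dict, Set, Tuple


def pairwise_overlaps(classifiers: Dict[str, Set[str]]) -> Dict[Tuple[str, str], int]:
    """Number of shared antigens for every pair of classifier lists."""
    return {
        (a, b): len(classifiers[a] & classifiers[b])
        for a, b in combinations(classifiers, 2)
    }


def common_to_all(classifiers: Dict[str, Set[str]]) -> Set[str]:
    """Antigens present in every classifier list."""
    return set.intersection(*classifiers.values())


# Toy classifier lists with placeholder antigen names (not study data).
toy = {
    "SCLC": {"Ag01", "Ag02", "Ag03"},
    "SqLC": {"Ag02", "Ag04"},
    "LCLC": {"Ag03", "Ag05"},
    "AdCa": {"Ag06"},
}

overlaps = pairwise_overlaps(toy)  # six subtype pairs for four lists
shared = common_to_all(toy)        # empty set: no antigen is in all four lists
```

The same two functions could be applied to lists from different pre-processing methods for a single subtype, for example by passing a dictionary keyed by method name instead of subtype.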