3.2. Class Comparison Analysis

Class comparison with a significance threshold of p < 0.001 was performed after quantile normalization, DWD adjustment, and ComBat adjustment of the data, and the significant antigens were compared between these methods. The overlap between all analyses, used to assess the normalization and batch removal methods, is summarized in Table 3 (details are given in Supplementary Figure S4). The highest numbers of significant antigens are obtained with ComBat adjustment and quantile normalization. Far fewer antigens are significant when analyzing DWD-adjusted and unnormalized data.

Table 3. Number of significant antigens (p < 0.001) resulting from class comparison analysis. Rows show results of the different data pre-processing methods; columns indicate the analyzed sample group (“r” means experimental run).

Each histological sample group was processed in a distinct run comprising only samples of one subtype as well as in a mixed run comprising samples of two subtypes and controls. Therefore, class comparison analysis could be performed either run-wise on each histological group, including only samples derived from a single run, or as a cross-run analysis, combining the samples of the single run with the samples of the respective subtype derived from the mixed run. For each histological subtype, the overlap between the significant antigens derived from the single-run analysis and those derived from the analysis of two combined runs was calculated (Table 4). This makes it possible to investigate whether the significant antigens derived from the analysis of a single run remain significant in cross-run analysis. For example, for the SCLC group, class comparison analysis was done in two ways: including only samples of “run 1”, and combining the samples of “run 1” with the SCLC samples of “run 5”. Then, the overlap of the significant antigens was calculated.
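The single-run versus cross-run comparison can be illustrated with a minimal Python sketch. It is not the procedure used in the study: the antigen identifiers are invented, and the overlap is assumed to be the number of shared antigens relative to the union of both lists, a denominator the text does not specify.

```python
def overlap_percent(single_run, cross_run):
    """Percentage of antigens that are significant in both analyses.

    Assumption: the overlap is expressed relative to the union of both
    lists; the denominator actually used in the study is not stated.
    """
    single_run, cross_run = set(single_run), set(cross_run)
    union = single_run | cross_run
    if not union:
        return 0.0  # no significant antigens in either analysis
    return 100.0 * len(single_run & cross_run) / len(union)


# Hypothetical antigen IDs for illustration only (not data from the study).
sig_single = {"AG001", "AG002", "AG003", "AG004"}  # e.g. SCLC, single run
sig_cross = {"AG003", "AG004", "AG005", "AG006"}   # e.g. SCLC, two runs combined

print(round(overlap_percent(sig_single, sig_cross), 1))  # 33.3
```

Choosing a different denominator (for example, the size of the single-run list) would change the percentages but not the ranking logic, so the function keeps that choice in one place.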
This overlap calculation was done for each histological group, using all three normalization methods as well as unnormalized data. Among the three normalization strategies, ComBat adjustment yields the highest overlap of significant antigens between single-run and cross-run analyses, ranging from 25.7% for SqLC to 43.4% for SCLC. Quantile normalization achieved the second highest overlaps, ranging from 18.3% for SqLC to 41.6% for SCLC. The lowest overlaps were obtained with DWD, which gives even lower overlaps for SCLC (13.7%) and SqLC (9.5%) than the unnormalized data (17.8% and 10.0%, respectively). Interestingly, for the LCLC group the normalization method had almost no influence on the overlap, which ranged from 38.9% to 39.6%. The overlap for AdCa is comparable for all three adjustment methods (30.0% for DWD, 31.7% for quantile normalization, and 38.3% for ComBat).

Table 4. Overlap (%) of significant antigens between single-run and cross-run class comparison analysis (p < 0.001). The calculation was done on the same data set in unnormalized, quantile-normalized, DWD-adjusted, and ComBat-adjusted form. SCLC = small cell lung cancer, SqLC = squamous cell lung cancer, LCLC = large cell lung cancer, AdCa = adenocarcinoma.

Comparing the significant antigens obtained from class comparison analysis (p < 0.001) with different normalization methods revealed that a high proportion of antigens is shared between quantile normalization and ComBat adjustment. This can be observed when analyzing cases and controls derived from one run (exemplified for SCLC in Figure 5a) as well as from two processing runs (exemplified for SCLC in Figure 5b). For SCLC, all antigens that are significant in the unnormalized data occur in at least one list of significant antigens of a pre-processed data set.
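The comparison across the four pre-processing variants amounts to counting the regions of a four-set Venn diagram. The following Python sketch shows one way to do this; the method labels follow the text, but the antigen identifiers and list contents are hypothetical and chosen only to make the example small.

```python
def venn_regions(lists):
    """Count antigens in each exclusive Venn region.

    `lists` maps a method name to its set of significant antigens. Each key
    of the result is the sorted tuple of methods reporting an antigen.
    """
    regions = {}
    all_antigens = set().union(*lists.values())
    for antigen in all_antigens:
        members = tuple(sorted(m for m, s in lists.items() if antigen in s))
        regions[members] = regions.get(members, 0) + 1
    return regions


# Hypothetical significant-antigen lists (not data from the study).
lists = {
    "unnormalized": {"A1", "A2"},
    "QN": {"A1", "A2", "A3", "A4"},
    "DWD": {"A1", "A5"},
    "ComBat": {"A1", "A3", "A4", "A6"},
}

regions = venn_regions(lists)

# Antigens shared by all four pre-processing variants.
shared_by_all = set.intersection(*lists.values())

# Check whether every antigen of the unnormalized list also appears in at
# least one list from pre-processed data.
preprocessed = lists["QN"] | lists["DWD"] | lists["ComBat"]
covered = lists["unnormalized"] <= preprocessed

print(regions[("ComBat", "QN")], sorted(shared_by_all), covered)
```

Counting exclusive regions keeps each antigen in exactly one cell, which is how a Venn diagram such as Figure 5a,b is read; pairwise totals such as the QN and ComBat overlap are then sums over all regions containing both labels.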
The number of significant antigens in the unnormalized and DWD-adjusted data is very low compared to the quantile-normalized and ComBat-adjusted data. However, a relatively high number of antigens is shared by all four differently pre-processed data sets: 15 antigens in the run-wise analysis and 45 antigens in the cross-run analysis (Figure 5a,b). This can also be observed for the LCLC group, while in the SqLC and AdCa groups the number of antigens shared by all four methods is relatively low (see Supplementary Figure S5).

Figure 5. Overlap of significant antigens derived from class comparison analysis (p < 0.001) with different normalization methods for the SCLC group. Data were analyzed unnormalized, quantile-normalized (QN), DWD-adjusted, and ComBat-adjusted. Significant antigens of SCLC versus controls from (a) single-run analysis (“run-wise” analysis) and (b) analysis of two combined experimental runs (“cross-run” analysis). The highest numbers of overlapping antigens are observed between QN and ComBat (318 and 256, respectively). (c) Excerpt of Supplementary Figure S4 showing relative overlaps (%, representing the intersections of the findings depicted in Figure 5a,b) of all class comparison analysis results between quantile normalization and ComBat adjustment.

Comparing relative intersections, the overlap of significant antigens between quantile normalization and ComBat adjustment ranges from 74.9% to 76.8% for single-run analyses and from 42.9% to 62.3% for cross-run analyses (Figure 5c). Considering the overlaps between all adjustment methods as depicted in Supplementary Figure S4, quantile normalization and ComBat adjustment yield the highest values.

The sample size plug-in from BRB-ArrayTools was used to calculate the number of samples needed per sample class for identifying antigens that are differentially reactive between controls and histological subtypes.
For this calculation, we selected an accepted type I error rate (α) of 0.001, a power (1 - β) of 0.9, and a mean difference in log2 expression between classes of 0.585 (equivalent to a fold-change of 1.5). Using the 50th percentile of the variance distribution, the estimated required sample sizes were 21, 21, 21, and 20 for SCLC, SqLC, LCLC, and AdCa, respectively. Therefore, good statistical power is given when using 25 samples per group.
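To make the roles of α, power, and the mean difference concrete, the following Python sketch uses a textbook normal-approximation formula for a two-sided, two-sample comparison. It is a stand-in under stated assumptions and not necessarily the computation performed by the BRB-ArrayTools plug-in; the function name and the variance value are hypothetical, since the 50th-percentile variance of the data is not reported here.

```python
import math
from statistics import NormalDist


def samples_per_class(alpha, power, delta, variance):
    """Samples needed per class for a two-sided, two-sample comparison.

    Normal approximation:
        n = 2 * (z_{1 - alpha/2} + z_{power})**2 * variance / delta**2
    Assumption: this formula stands in for the plug-in's own calculation,
    which is not described in the text.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 * variance / delta ** 2)


# A fold-change of 1.5 corresponds to a log2 difference of about 0.585.
delta = math.log2(1.5)

# Placeholder per-antigen variance of log2 intensities (not from the study).
n = samples_per_class(alpha=0.001, power=0.9, delta=delta, variance=0.1)
print(round(delta, 3), n)
```

Under this approximation the required sample size scales linearly with the variance, which is why the choice of percentile of the variance distribution matters for the resulting estimate.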