3.1. Assessing RNA Quality Good RNA quality is an important prerequisite for obtaining reliable results from a microarray gene expression experiment. RNA quality propagates from the hybridized sample to the obtained gene expression estimates and consequently also to differential expression results. Combined with the previous detection of a noticeable degradation effect upon the majority of microarray samples [31], variability of RNA quality has a high risk of being a major source of technical bias. The RNA Integrity Number (RIN) provides a measure for RNA quality that is determined for most microarray samples before hybridization [32]. RIN values scale between 1 and 10, and using only samples with RIN ≥ 7 is recommended [33]. However, RIN values are unfortunately seldom stored in conjunction with the experimental data. The so-called dk degradation index provides a sensitive estimate of RNA-quality that can be computed from raw GeneChip microarray data. It was shown to correlate well with RIN [26]. For fresh tissue, the cutoff of RIN ≥ 7 corresponds to a cut off of the degradation index of dk ≥ 0.45. Note that dk values are not only sensitive, but also specific for RNA quality since only RNA degradation and amplification have such a systematic effect on this value estimated by means of the probe intensity decay with increasing distance between the probe location and the transcription start site [26]. We have computed the dk values for all 8131 samples of the HumanArraySet which were either included or excluded from the HumanExpressionAtlas data set as described above. Figure 1a shows the resulting density distribution of dk for the qc-included/qc-excluded sample sets. Most samples included after quality control have a degradation index between 0.5 ≤ dk ≤ 0.8 referring to acceptable RNA quality. On the other hand, a large fraction of the qc-excluded samples exhibits values of dk < 0.45 referring to critically low RNA quality. This applies to 25% of the qc-excluded samples and to 10% (868) of all investigated samples. Note that some of these samples might be part of an experiment that is entirely based on low input RNA and multiple rounds of amplification, respectively. Such an experiment can provide valuable information when analyzed individually. Figure 1 Variation of RNA quality among a large set of microarray samples and its impact on expression results. Panel (a) shows the density distribution of dk values measuring the degradation for samples either included or excluded in the HumanExpressionAtlas data set by independent quality control. The red line indicates the low quality threshold corresponding to RIN ≤ 7. The inset shows RNA degradation plots for two selected samples. The corresponding dk values (red and blue dots) are shown in the inset and in the density distribution, respectively; Panel (b) shows the first two principal components of the HumanExpressionAtlas expression space where sample points are colored according to their RNA quality measure dk. Furthermore, 3% (162) of the qc-included samples are so severely degraded that they should have been excluded by RIN analysis. Expression estimates of these samples are biased, with negative consequences for the reliability of downstream results. That these samples were included in the HumanExpressionAtlas suggests that a more rigorous assessment of RNA-quality should be applied in quality control procedures. Note that our results agrees well with a previous estimation that about 2% of samples of large microarray series have low RNA-quality [34]. Interestingly, only few qc-included samples have dk values larger than 0.8, which obviously represents an upper limit referring to the “weakest possible intensity decay”. This limit can be attributed either to the insufficiency of the clean-up assays to fully stop RNAase activity or, alternatively, to the incomplete amplification of aRNA fragments. On the other hand, a fraction of 8.1% of the qc-excluded samples has values of dk > 0.8 which could be also due to uncertainties in estimating dk at large signal noise levels [26]. We next investigate how the dk parameter changes together with different batches of samples for which we suspect technical variation. To this end, sample groups were defined using a combination of experiment id and experimental date as extracted from the HumanExpressionAtlas. For the experimental date we use the month in which the microarray sample has been hybridized, which can be extracted from Affymetrix raw data files. Supplementary Figure S1 shows the dk parameters for the samples ordered by these batches. Using a simple linear model we obtained a generalized R2 statistic with a value R2 = 0.67 indicating that a considerable covariation between the batches and the degradation index dk. To test for confounding of the HumanExpressionAtlas with the effect of variable RNA quality we plot its first two principal components of the “expression space” referring to the pre-processed data as available in ArrayExpress in Figure 1b. The same representation was shown in [7] where the data points were however colored according to biological groups. It could be shown that the first principal axis (“hematopoietic axis”) differentiates the biological groups hematopoietic system, solid tissues, connective tissues and incompletely differentiated cell types, and that the second principal axis (“malignancy axis”) differentiates cell lines, neoplasms and normal/non-neoplastic disease tissues. Here, the samples are not colored according to biological groups but instead according to the value of the degradation index dk. The resulting plot in Figure 1b shows a clear color gradient along the second principal axis with, in general, lower RNA quality (red spots) at the top and higher RNA quality (yellow spots) at the bottom. The sample groups “normal/non-neoplastic disease tissues” are here associated with better RNA quality and neoplasms with worse RNA quality. Altogether, the correlation coefficient between the degradation index dk and the second principal component is r = −0.44. Hence, on average the RNA used for hybridization lost quality with increasing malignancy of the samples.