2.3. Data Pre-Processing

2.3.1. Affymetrix GeneChip® miRNA Array

Intensity .CEL files were obtained from the scan images and imported into the Affymetrix® miRNA QC Tool software (Version 1.0.33.0) to quantify the signal values. Quality control (QC) was assessed by plotting the average intensity of the oligo spike-in and background probe sets (included in the control target content) across all of the arrays. According to Genisphere, oligo spike-in probe sets 2, 23, 29, 31 and 36 should present intensities of more than 1000 units for an array to be accepted. Detection calls for the miRNAs were made using the Affymetrix detection algorithm, based on the non-parametric Wilcoxon rank-sum test and applied independently to each array and probe/probe set; a p-value greater than 0.06 denotes "not detected above background" [19]. For data normalization, the "default" method was used, obtaining log2 expression values (expression value data matrix) from the raw data (intensity value data matrix). Briefly, this method involves three steps: (i) grouping the background probe intensities into bins by GC content, where the median intensity of each bin serves as the correction value for every probe with the same GC content; (ii) quantile normalization; and (iii) median polish summarization. To obtain a single log2 intensity value for each miRNA mapped on the array, the intensity measures of replicated spots were averaged.

2.3.2. Agilent Human miRNA Microarray (V1)

Images were scanned using the Agilent Feature Extraction (AFE) software (Version 11.0.1.1), obtaining the total gene signal (TGS) for all miRNAs on the array. Negative values were removed by adding, for each array separately, the absolute value of the minimum TGS intensity on that array (as extracted by the AFE) plus 2, before log2 transformation [20]. Data extracted from AFE were imported into the R environment [21] and processed using the AgiMicroRna package, available in Bioconductor [22,23].

2.3.3. Illumina HumanMI_V2

Raw data were processed using the proprietary BeadStudio software (Version 3.3.8). No background subtraction was performed. Probe-level data were summarized to obtain miRNA-level data, and a log2 transformation was then applied.

2.3.4. miRNA Selection and Normalization

The Affymetrix platform contained 7815 miRNAs, of which 847 (10.83%) were human; the Agilent platform contained 961 miRNAs, of which 851 (88.55%) were human; and the Illumina array contained 1145 miRNAs, of which 858 (74.93%) were human. Human miRNAs common to all platforms were selected according to their names and confirmed by a search in miRBase (Release 18, November 2011). Unlike other published works, miRNAs were not filtered on a detection basis, because such an approach could introduce a bias in the results. In fact, some of the miRNAs that are filtered out because they are "switched off" could show patterns of within- and/or between-platform disagreement in another experiment, in which they are "turned on"; this could lead to an over-estimate of the level of reliability. Considering only human miRNAs should circumvent this issue and, at the same time, provide relevant information, since human miRNAs are commonly those of major interest in biomolecular investigations.

Moreover, no data normalization was performed. Almost all works that focused on comparing microarray platforms normalized their data (for instance, [15,16,24]), but this is a non-trivial issue that has to be carefully evaluated. As a matter of fact, normalization for miRNA microarrays has been largely debated to date, with somewhat discordant results [25,26,27,28,29], so that no "gold-standard" method exists. Additionally, normalizing data in the context of assessing platform agreement poses other relevant problems.
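The cross-platform selection step described above amounts to intersecting the three platforms' miRNA name lists and keeping the human (hsa-prefixed) entries; a minimal sketch in Python, where the name sets are illustrative placeholders rather than the actual array content, and the subsequent miRBase confirmation is not shown:

```python
# Illustrative sketch of the cross-platform miRNA selection step.
# The name sets below are placeholders, not the real array content.
affymetrix = {"hsa-miR-21", "hsa-miR-155", "hsa-let-7a", "mmu-miR-1"}
agilent = {"hsa-miR-21", "hsa-let-7a", "hsa-miR-16"}
illumina = {"hsa-miR-21", "hsa-let-7a", "hsa-miR-155"}

def common_human(*platforms):
    """Intersect the platforms' name sets, keeping human (hsa-) miRNAs only."""
    common = set.intersection(*(set(p) for p in platforms))
    return sorted(m for m in common if m.startswith("hsa-"))

selected = common_human(affymetrix, agilent, illumina)
# -> ['hsa-let-7a', 'hsa-miR-21']
```

In practice the selected names would then be checked against miRBase (Release 18) before use.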
If data on two different platforms are normalized and then compared, there is no way to separate the effect of the platform from that of the normalization in the assessment of concordance/agreement/reproducibility. A high level of between-platform agreement might be found that is due not to the platforms themselves, but to the normalization used. Conversely, the same normalization applied to different platforms could highlight patterns of discordance that cannot be ascribed to the platforms. Nonetheless, comparing un-normalized data carries the risk of finding poor concordance because of incidental batch effects in the experiment, which may lead to an underestimate of the "true" agreement between platforms. In this paper, we chose to use non-normalized data, so that we could assess the performance of the different platforms "per se". For the sake of comparison, the data were also normalized with the quantile and loess algorithms, and the results were compared with those obtained on the non-normalized data.
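As a point of reference for the comparison above, quantile normalization (also step (ii) of the Affymetrix "default" method in Section 2.3.1) forces every array to share the same intensity distribution: each value is replaced by the mean, across arrays, of the values holding the same rank. A minimal pure-Python sketch on a toy data set (the numbers are illustrative only, not from the experiment):

```python
def quantile_normalize(arrays):
    """Quantile-normalize a list of equal-length arrays (one per sample):
    rank each array, average the sorted arrays into a reference
    distribution, and reassign the reference values by original rank.
    Ties are broken by position (a simplification of real implementations)."""
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: mean across arrays at each rank.
    reference = [sum(a[i] for a in sorted_arrays) / len(arrays) for i in range(n)]
    normalized = []
    for a in arrays:
        ranks = sorted(range(n), key=lambda i: a[i])  # indices in ascending order
        out = [0.0] * n
        for rank, idx in enumerate(ranks):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized

# Toy example: three arrays, four probes each (illustrative values).
raw = [[5.0, 2.0, 3.0, 4.0],
       [4.0, 1.0, 4.0, 2.0],
       [3.0, 4.0, 6.0, 8.0]]
norm = quantile_normalize(raw)
# After normalization, every array is a permutation of the same
# reference distribution, so the arrays share identical value sets.
```

This illustrates why agreement measured after normalization partly reflects the normalization itself: the procedure deliberately equalizes the distributions across arrays.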