4. Normalization
Normalization is an essential step in microarray gene-expression data analysis. It helps to reduce non-biological errors and to convert raw data into valid results. Most established normalization methods for high-density arrays rest on two assumptions: that relatively few genes are dramatically up- or down-regulated compared to the total number of genes, and that the intensity measures for the same miRNA population being tested have similar distributions across different slides. However, these assumptions are violated for miRNA microarray data, because the total number of miRNAs is small and current miRNA microarray platforms possibly do not include enough miRNAs with stable expression [38].

4.1. Linear Normalization
The data obtained from two slides, or from differently dyed samples on the same slide, might not be directly comparable. For gene expression arrays, the actual expression level in molecular units can hardly be discerned, and hence it is hard to calibrate data from different arrays. One common normalization method is linear rescaling: all measures from the same array are multiplied by a constant, so that the expression levels of the various arrays are brought to roughly the same level. The immediate challenge for this approach is how to find the normalizing constants for the different arrays. For high-density arrays, if the expression levels of the majority of genes are stable across samples (arrays), rescaling under the assumption that all profiles have the same median (or trimmed mean) intensity works quite well. However, when fewer than 2,000 miRNAs are being tested and the majority of them are weakly expressed or not expressed at all, the performance of such a normalization method could be questionable.
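As a minimal sketch of the linear rescaling described above, the following Python code multiplies each array by a constant so that all arrays share the same median intensity. The function name `median_rescale` and the default target (the mean of the per-array medians) are illustrative assumptions, not part of any cited method.

```python
from statistics import median


def median_rescale(arrays, target=None):
    """Rescale each array by a constant so that all arrays share one median.

    arrays: list of lists of intensity measures, one inner list per array.
    target: common median to scale to. If omitted, the mean of the
            per-array medians is used (an assumption made for this sketch).
    Returns the rescaled arrays and the normalizing constant of each array.
    """
    medians = [median(a) for a in arrays]
    if any(m <= 0 for m in medians):
        raise ValueError("each array needs a positive median intensity")
    if target is None:
        target = sum(medians) / len(medians)
    # One normalizing constant per array.
    factors = [target / m for m in medians]
    scaled = [[x * f for x in a] for a, f in zip(arrays, factors)]
    return scaled, factors


if __name__ == "__main__":
    raw = [[1, 2, 3], [2, 4, 6]]
    scaled, factors = median_rescale(raw)
    print(factors)
    print(scaled)
```

When most probes are weakly expressed or not expressed, the median sits in the noise range, which is one way to see why this constant can be unreliable for miRNA arrays.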
Efforts have been devoted to finding specific controls for miRNA normalization. Ideally, controls for a specific analytical platform should be consistently stable and highly abundant regardless of tissue type or treatment. They should also have characteristics similar to those of miRNAs, including size, biogenesis and stability. Non-coding RNAs (ncRNAs) have been used as normalization controls by some platforms, including the Exiqon miRCURY LNA miRNA Array, the Luminex FlexmiR panel and TaqMan-based qRT-PCR. However, some ncRNA normalization controls have been found to be influenced by chemotherapy drug treatments, such as 5-FU, Cisplatin or Doxorubicin [37,39,40]. As a result, we need to be aware of the stability of normalization controls across a relatively wide variety of tissues, cell lines and conditions. When several normalization controls are available, it is recommended to evaluate their stability before they are adopted for data normalization. For the FlexmiR bead arrays, we proposed a measurement error model-based algorithm to normalize the intensity measures from the five different pools by using the four normalization beads [41].
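One simple way to screen candidate controls for stability is to compare their coefficients of variation across samples. The sketch below is only an illustration of that idea: the function names and the control names `ctrl_A` and `ctrl_B` are hypothetical, the intensities are made-up toy values, and this is not the measurement error model of [41] or any published stability algorithm.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (needs >= 2 values)."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean intensity is zero; CV is undefined")
    return stdev(values) / m


def rank_controls_by_stability(control_intensities):
    """Return (control name, CV) pairs, most stable (lowest CV) first.

    control_intensities: dict mapping a control name to its intensity
    measures across samples, tissues or treatment conditions.
    """
    cvs = {name: coefficient_of_variation(vals)
           for name, vals in control_intensities.items()}
    return sorted(cvs.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Toy values for two hypothetical controls across four conditions.
    controls = {
        "ctrl_A": [100, 102, 98, 101],
        "ctrl_B": [100, 150, 60, 120],
    }
    for name, cv in rank_controls_by_stability(controls):
        print(name, round(cv, 3))
```

A low coefficient of variation on unnormalized data can still hide a global shift shared by all probes, so a screen of this kind is a first check rather than a validation.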
Efforts have also been made to normalize miRNA microarray data using “invariants”, that is, a set of miRNAs that are not differentially expressed across arrays [38,42,43,44,45,46]. Because the actual expression levels cannot be determined in molecular units, it is also challenging to determine whether a set of miRNAs is actually unchanged or is changing in a stable way across arrays. In other words, if two groups of miRNAs behave similarly, it can be difficult to decide which group should be used for normalization. Wang et al. proposed to borrow strength from another platform and to estimate the overall expression pattern of the entire miRNA profile using a panel of representative miRNAs validated with qRT-PCR results [47].
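The invariant idea can be illustrated with a rank-based selection: probes whose rank barely moves between a reference array and a sample array are treated as invariants, and the scaling constant is taken from them alone. This is a simplified sketch under stated assumptions (the function names and the `max_rank_shift` threshold are hypothetical) and it does not reproduce any of the cited methods.

```python
from statistics import median


def ranks(values):
    """Rank of each value (0 = smallest); ties are broken by position."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order):
        result[index] = rank
    return result


def rank_invariant_scale(reference, sample, max_rank_shift=1):
    """Scale `sample` to `reference` using rank-invariant probes only.

    A probe counts as invariant when its rank differs by at most
    `max_rank_shift` between the two arrays (threshold is an assumption).
    Returns the scaled sample, the invariant probe indices and the factor.
    """
    if len(reference) != len(sample):
        raise ValueError("arrays must contain the same probes")
    ref_r, smp_r = ranks(reference), ranks(sample)
    invariant = [i for i in range(len(reference))
                 if abs(ref_r[i] - smp_r[i]) <= max_rank_shift]
    if not invariant:
        raise ValueError("no rank-invariant probes found")
    sample_median = median([sample[i] for i in invariant])
    if sample_median == 0:
        raise ValueError("invariant probes have zero median intensity")
    factor = median([reference[i] for i in invariant]) / sample_median
    return [x * factor for x in sample], invariant, factor


if __name__ == "__main__":
    reference = [10, 20, 30, 40, 50]
    sample = [20, 40, 60, 500, 100]  # probe 3 is strongly up-regulated
    scaled, invariant, factor = rank_invariant_scale(
        reference, sample, max_rank_shift=0)
    print(invariant, factor, scaled)
```

Stable ranks do not prove that a probe is unchanged, which is the ambiguity raised above: a group of miRNAs that all shift together would pass this filter as well.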

4.2. Nonlinear Normalization
It has been observed that changes in gene expression levels are nonlinear, especially for highly expressed genes/miRNAs. Loess normalization is a popular normalization method for miRNA microarrays. It is based on robust local regression of the log ratios of the intensity measures from two arrays (or from two differently dyed samples on the same array) on the overall spot intensities. Variants of the loess normalization method have also been introduced to refine the linear scaling part and enhance its performance [48,49,50]. Quantile normalization is another commonly used nonlinear normalization method, and it has been successfully migrated to miRNA array data analysis. The quantile normalization method assumes that there is an underlying common distribution of the intensities of all miRNAs across arrays [51,52,53]. For miRNA arrays, because the total number of miRNAs is small and the overall expression level is low, the ranks of the miRNAs could be greatly affected by background noise, and hence the performance of the quantile normalization method could be affected. Nevertheless, quantile normalization is reported in the literature as one of the best-performing normalization methods for miRNA data [37,40,41,45,46,47]. For the FlexmiR bead arrays, because of the small number of miRNAs profiled in the five pools (60, 64, 64, 65 and 66), quantile normalization is not suitable for intra-sample (among pools) normalization. However, after the sub-profiles have been appropriately assembled, the quantile normalization method performs better than some other normalization methods [41].
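The common-distribution assumption behind quantile normalization can be made concrete with a short sketch. Each array is sorted, the values at each rank are averaged across arrays to form a reference distribution, and every intensity is then replaced by the reference value at its rank. The function name `quantile_normalize` is illustrative, and ties are broken by position here for simplicity, whereas established implementations typically average tied values.

```python
def quantile_normalize(arrays):
    """Force all arrays to share one intensity distribution.

    arrays: list of equal-length lists, one inner list per array, with
    the same probe order in each. Returns new lists in which the k-th
    smallest value of every array equals the mean of the k-th smallest
    values across all arrays.
    """
    n = len(arrays[0])
    if any(len(a) != n for a in arrays):
        raise ValueError("all arrays must have the same number of probes")
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: mean across arrays at each rank.
    reference = [sum(s[k] for s in sorted_arrays) / len(arrays)
                 for k in range(n)]
    normalized = []
    for a in arrays:
        # Probe indices ordered by intensity; ties broken by position.
        order = sorted(range(n), key=lambda i: a[i])
        out = [0.0] * n
        for rank, index in enumerate(order):
            out[index] = reference[rank]
        normalized.append(out)
    return normalized


if __name__ == "__main__":
    raw = [[5, 2, 3], [4, 1, 6]]
    print(quantile_normalize(raw))
```

Because the output depends only on ranks, a small noise-driven change in rank among weakly expressed miRNAs changes the normalized value directly, which illustrates the sensitivity to background noise noted above.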