Comparisons with other modeling approaches Some (not all) of the dimensional reduction methods for ESM discussed here have been sporadically and individually applied to real data in the literature. For example, a popular practice with ASM is to solely focus on the coefficient of the principal basis function (e.g., canonical curve in SPM) with other coefficients (e.g., time and dispersion derivatives) being a priori abandoned. As our results with real data showed, the investigator may fail to detect most activations when the effect lies in the HDR shape nuances but not the peak. One suggestion for ASM was to extend the definition of amplitude in (6) to the L2-distance by including either the effect for the time derivative (7) or the effects for both time and dispersion derivatives (8) (Calhoun et al., 2004; Worsley and Taylor, 2006; Steffener et al., 2010). A similar approach was to express the effect estimates from the first two basis functions of ASM as a complex number (Wang et al., 2012). However, the potential issues with L2D or its analogs (e.g., Worsley and Taylor, 2006) are the following. a) The definition of amplitude extension in (7) and (8) is under the premise that all the three basis functions are orthogonal with each other (Calhoun et al., 2004). However, only the first two basis functions are orthogonal with each other, but not the third one. b) The second and third basis functions are not normalized; that is, they are not scaled to have a maximum value of 1, unlike the first basis function. In addition, the three effect estimates have different dimensions: the first is of percent signal change while the other two of percent signal change by the unit of time. Therefore, it is difficult to render a physically meaning interpretation with the L2D measures. c) All the effect estimates including negative values are folded into a positive L2D measure, which cannot be differentiated among those effect estimates on the same circle or sphere (see Table 1). In addition, it may lead to the violation of the Gaussian distribution assumption, as illustrated in the poor controllability of FPR (Figure 1). d) Their power performance is not satisfactory (Figures 1, 2). As an alternative, MVT or LME through the hypothesis (2a) or (2b) on the two or three effect estimates from ASM, as shown in Figure 2A, provides a more accurate characterization because it allows for different units or dimensions across the effects. Similarly for ESM, two dimensional reduction methods have separately been adopted in data analyses. For example, AUC was employed in Beauchamp et al. (2003), Greene et al. (2007), and McGregor et al. (2013). Although not explicitly stated, XUV was used in several real applications to identify the HDR effect under a condition through the main effect (or one-way interaction) of the ESM components in a one-way within-subject ANOVA (Weissman et al., 2006; Geier et al., 2007; Church et al., 2008), to detect the group or condition differences in the overall HDR shape through the group-by-component or condition-by-component interaction in a two-way ANOVA (e.g., Schlaggar et al., 2002; Church et al., 2008; Shuster et al., 2014), and to explore the three-way group-by-task-by-component interaction (Church et al., 2008). However, two limitations were not addressed in those analyses: the potential identification failure of XUV (Table 1 and Voxel 2 in Figure 2), and the limited applicability of univariate GLM. Some comparisons were performed in terms of amplitude, peak latency, and duration in the estimated HDR among various modeling methods (e.g., FSM, L2D, ESM, a nonlinear model, and inverse logit model; Lindquist et al., 2009). The inverse logit model was deemed the best among the candidates in both simulations and real data, and slightly more powerful than ESM. However, the comparisons were not optimal. First, the dimensional reduction from the HDR shape in ℝm to the three quantities (amplitude, delay, and duration) in ℝ3 might be compromised in power when detecting the shape subtleties—this point can be highly dependent on the experiment. Secondly, the reliability for the estimation of the three characteristics was suboptimal. For example, the lackluster performance of ESM in Lindquist et al. (2009) might be caused by the inaccurate amplitude based on the first local peak because such an approach could be misleading especially when more than one local peak occurs. Lastly, the final group analyses were still focused on the amplitude with the Student's t-test, an effective dimensional reduction from ℝm to ℝ1. A multivariate approach (Zhang et al., 2012) was previously proposed, analogous to our method except for the following differences. It was demonstrated among the voxels within only five structurally pre-defined regions; smoothing the estimated HDR from each subject by a Gaussian kernel and imposing regularization on the smoothed HDR were performed to improve the temporal continuities of the HDR; and group analysis was run through multivariate testing of one-sample or pair-wise comparisons among conditions, equivalent to MVT (2a or 2b) discussed here. Another approach (Zhang et al., 2013) assumed that the HDR under each condition would only vary in amplitude and latency across subjects; that is, the HDR shape was presumed same across all subjects. Specifically, the HDR curve for each condition was characterized at the group level by two parameters: one was of interest (amplitude) and the other of no interest (delay). In addition, the HDR shape (fixed across subjects) was modeled by cubic splines plus their time derivatives. Once the amplitude was estimated for each subject in a one-tier model that incorporated both within- and across-subject variances, a second round of group analysis was performed only on the amplitudes (ignoring the delay) through typical one-sample or paired t-test to make inference about a condition or contrast. The approach was demonstrated among the voxels within only three structurally predefined regions. Recently, a hierarchical approach was proposed for ESM through integrating both individual and group levels into one model (Degras and Lindquist, 2014) in which the HDR curves were captured through multiple higher-order B-spline functions. Even though only demonstrated on one slice of data, the approach is appealing because the variability at both levels is accounted for. However, the current implementation in Matlab is hindered by the following constraints or limitations. a) Spatial parcellation based on anatomical structure was required to determine the temporal correlation structure in the noise component. More applicable approaches would be based on a priori regions that are functionally parcellated through, for example, hierarchical clustering (Thirion et al., 2006; Ji, 2010), joint parcellation detection-estimation (Badillo et al., 2014), consensus clustering (Badillo et al., 2013), k-means clustering (Ji, 2010), etc. (b) The HDR shape may vary across different stimulus conditions under some scenarios (e.g., Ciuciu et al., 2003), and a presumption of the same shape HDR as in Degras and Lindquist (2014) may decrease the detection power when the shape subtleties are of interest. The same HDR assumption is reasonable under other circumstances and has proven sufficient for encoding or decoding the brain activity (Pedregosa et al., 2015). c) Final statistical inference in Degras and Lindquist (2014) through an asymptotic t-test was still based on the scaling factors of the same HDR curve shared by all conditions, a dimensional reduction approach from ℝm to ℝ1. An alternative approach is the incorporation of both individual and group levels in a mixed-effects model under the Bayesian framework (Chaari et al., 2013; Badillo et al., 2014). Applied at a priori regions that are functionally parcellated, this jointed detection and estimation method may render a robust procedure less sensitive to outliers than the conventional two-tier methods under the assumption that all the voxels share the same HDR within a region or parcel.