Data Analysis

Hypothesized relations among the integrated model constructs were tested in the Australian and U.S. samples separately using variance-based structural equation modeling implemented in the WARP 7.0 analysis package [33]. Model parameters and standard errors (SEs) were computed using the “Stable3” estimation method, which has been shown to provide the most precise parameter estimates in complex structural models in smaller samples and to outperform bootstrapping methods in simulation studies [33]. Simulation studies have also shown this method to provide more consistent and precise estimates in data containing outliers, which may inflate SEs and lead to abnormally high p-values [33]. Two models were estimated in each sample: a model testing predictions of the proposed integrated model with the binary demographic variables included as covariates (Model 1; Fig. 1, upper panel) and a model that also included effects of past social distancing behavior (Model 2; Fig. 1, lower panel). All constructs were latent variables indicated by single or multiple items. There were no missing data for the social cognition and self-reported behavioral variables. Missing data for the demographic variables ranged from 0.5% to 8.8% in the Australian sample and from 0.9% to 6.4% in the U.S. sample. Missing data are reported in Supplementary Appendix B and were imputed using stochastic hierarchical regression [33].

The analysis afforded several tests of the adequacy of the measures used to indicate the latent variables in the model. Construct validity of the latent factors for the social cognition, intention, and behavioral variables was evaluated using the normalized factor pattern loadings after oblique rotation and Kaiser normalization [33] and the average variance extracted (AVE), which should approach or exceed .700 and .500, respectively.
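As an illustration of the AVE criterion, the following minimal sketch computes the AVE as the mean of the squared standardized loadings of a latent variable's indicators, which is the conventional definition. The loadings are hypothetical values chosen for illustration and are not taken from the study data; the function name is likewise an assumption and does not correspond to any routine in the analysis package.

```python
def average_variance_extracted(loadings):
    """Return the AVE: the mean of the squared standardized loadings."""
    squared = [loading ** 2 for loading in loadings]
    return sum(squared) / len(squared)


# Hypothetical standardized loadings for a three-item latent variable
# (illustrative only; not values from the study).
hypothetical_loadings = [0.82, 0.76, 0.88]

ave = average_variance_extracted(hypothetical_loadings)

# Guideline values from the text: loadings near or above .700, AVE near or above .500.
loadings_ok = all(loading >= 0.700 for loading in hypothetical_loadings)
ave_ok = ave >= 0.500

print(f"AVE = {ave:.3f}; loadings adequate: {loadings_ok}; AVE adequate: {ave_ok}")
```

Because the text states that values should "approach or exceed" the guideline values, the strict comparisons above are a simplification; a value slightly below a threshold would still warrant judgment rather than automatic rejection.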
Internal consistency of the factors was estimated using omega reliability coefficients (ω) and composite reliability coefficients (ρ), which should exceed .700 and ideally approach .900. We also tested the discriminant validity of the constructs in the model. Discriminant validity was supported when the square root of the AVE for each latent variable exceeded its correlations with the other latent variables.

Adequacy of the proposed model in describing the data was established using the goodness-of-fit (GoF) index, with values of .100, .250, and .360 corresponding to small, medium, and large effect sizes, respectively. Further information on model quality was provided by the average path coefficient and average R² coefficient. These indices summarize the average parameter estimates of relations in the model and the amount of variance explained in each dependent variable, respectively, and should be statistically significant for a good-quality model. In addition, overall indices of GoF were provided by the average block variance inflation factor for model parameters and the average full collinearity variance inflation factor, both of which should be equal to or lower than 3.3 for well-fitting models. These indices indicate the extent to which latent variables in the model overlap and contribute to model multicollinearity. They therefore indicate the uniqueness of the latent variables in the model. Four further indices were also used to evaluate model quality: the Simpson’s paradox ratio (SPR), the R² contribution ratio (R2CR), the statistical suppression ratio (SSR), and the nonlinear bivariate causality direction ratio (NLBCDR). The SPR indicates the extent to which the model is free from instances of Simpson’s paradox (i.e., when the path coefficient and the correlation associated with a latent variable have opposite signs), the presence of which indicates a possible causality problem. The SPR should exceed .700 and ideally approach 1.000.
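The SPR can be read as the proportion of model paths that are free from Simpson's paradox instances. The sketch below illustrates this reading under stated assumptions: each path is represented by a hypothetical pair of values (path coefficient, corresponding correlation), a path counts as "free" when the two values do not have opposite signs, and a product of exactly zero is treated as free. The package's own computation [33] may handle edge cases differently.

```python
def simpson_paradox_ratio(paths):
    """Proportion of paths whose coefficient and correlation do not have opposite signs.

    `paths` is a list of (path_coefficient, correlation) pairs.
    """
    # Assumption: a product of zero is counted as free from Simpson's paradox.
    free = sum(1 for beta, r in paths if beta * r >= 0)
    return free / len(paths)


# Hypothetical (path coefficient, correlation) pairs; not values from the study.
hypothetical_paths = [
    (0.35, 0.42),
    (0.12, 0.20),
    (-0.08, 0.05),   # opposite signs: one instance of Simpson's paradox
    (0.27, 0.31),
    (-0.15, -0.22),
]

spr = simpson_paradox_ratio(hypothetical_paths)
print(f"SPR = {spr:.3f}; exceeds .700 guideline: {spr > 0.700}")
```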
The R2CR and SSR indicate the extent to which models are free from instances of negative R² contributions and statistical suppression, and should exceed .900 and .700, respectively. The NLBCDR estimates the extent to which the proposed “causal” associations in the model are more tenable than those in the opposite direction, and thus provides an initial indication of support for the hypothesized directions of the causal links in the proposed model. The NLBCDR should exceed .700 for high-quality models. Kock [33] provides further technical details on model fit and quality indices.

Model effects were estimated using standardized path coefficients with confidence intervals (CIs) and test statistics. Effect sizes were estimated using a variant of Cohen’s f-square coefficient and represent the individual contribution of the predictor variable to the R² coefficient of the criterion latent variable. Values of .02, .15, and .35 represent small, medium, and large effect sizes, respectively. Differences in the path coefficients in the models across the samples were tested in a multiple-group analysis using the Satterthwaite method with two-tailed significance tests.

We also tested whether the inclusion of participants who were never under a “shelter-in-place” order, or who had the “shelter-in-place” order lifted during the study, affected predicted relations in the models. The small numbers of participants who were, at some point, not subject to “shelter-in-place” orders meant we could not conduct a formal moderator analysis, so we conducted a sensitivity analysis testing whether effects in the models differed if data from these participants were excluded.
Models excluding and including past behavior were estimated in the sample excluding participants who were never subject to a “shelter-in-place” order, and in the sample excluding participants who were never subject to an order or who had the order lifted at some stage during the study. Formal comparisons of parameter estimates in these models with those from the full sample were made using the Satterthwaite method.

Data files, analysis scripts, and output files for all analyses are available online: https://osf.io/x9tms/.
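The comparisons of path coefficients described above can be illustrated with a minimal sketch. It assumes the unpooled form of the test statistic usually associated with the Satterthwaite method, in which the difference between two coefficients is divided by the square root of the sum of their squared SEs, and it assumes a standard normal approximation for the two-tailed p-value. The function name and input values are hypothetical, and the analysis package [33] may compute degrees of freedom and p-values differently.

```python
import math


def satterthwaite_compare(beta1, se1, beta2, se2):
    """Compare two path coefficients using an unpooled standard error.

    Returns the test statistic and a two-tailed p-value.
    """
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2)
    statistic = (beta1 - beta2) / se_diff
    # Assumption: two-tailed p-value from the standard normal distribution,
    # using the identity 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(statistic) / math.sqrt(2))
    return statistic, p_value


# Hypothetical coefficients and SEs for the same path in two samples
# (illustrative only; not values from the study).
t_stat, p_two_tailed = satterthwaite_compare(0.30, 0.06, 0.18, 0.08)
print(f"t = {t_stat:.3f}, two-tailed p = {p_two_tailed:.3f}")
```

With these hypothetical inputs the difference would not be statistically significant at the conventional .05 level, which in the context of the sensitivity analysis would be read as no evidence that the path differs between the samples being compared.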