Most Sites in the SARS-CoV-2 Genome Were under Purifying Selection. Using phylogenetically informed models (as described above), we identified two sites, residue 614 in S and residue 13 in N, that were under diversifying selection in a majority of subsampled alignments. For each protein, subsampled alignments tended to have more sites under purifying selection (median = 7.34 ± 4.06% [±SD]) than under diversifying selection (3.10 ± 1.92%) (Mann-Whitney U test, P = 0.057; SI Appendix, Fig. S4). Purifying selection is indicative of a decrease in genetic diversity in the population. Likewise, for each codon separately, the proportion of each phylogeny (i.e., the percentage of total branch length) with dN/dS > 1 was small, indicating that diversifying selection was episodic and limited (Fig. 3A). Global measures of dN/dS varied across genes, ranging from 0.35 ± 0.02 (M) to 1.43 ± 0.24 (ORF10), and were significantly lower for structural genes than for nonstructural genes (Mann-Whitney U test, P = 0.042) (Fig. 3B). Per-lineage nonsynonymous substitution rates were comparable (Student’s t test, P = 0.218) in structural (0.0011 ± 0.021) and nonstructural (0.0012 ± 0.028) genes, although some subsampled alignments showed rates as much as a hundred times higher than the median over all alignments (Fig. 3C). Across structural proteins, mutations were disproportionately neutral: >70.3% of branch length evolved under neutral (or negative) selection for all sites, and over half of all branch length evolved under neutral (or negative) selection for >82.8% of sites (Fig. 3D) (29).

Fig. 3. Evolution across the SARS-CoV-2 genome. (A) Bar plot of the average percentage of branch length under diversifying selection (dN/dS > 1) for each site. (B) Bar plot of dN/dS per gene (dN = dS is shown as a dashed line). Error bars indicate SD across subsampled alignments. (C) Box plot of nonsynonymous substitutions per lineage per site across structural and nonstructural genes. Values across subsampled alignments for each gene are plotted. (D) Average percentage (over subsampled alignments) of branch lengths evolving under neutral (or negative) selection per site for each structural gene. Median values are shown by dashed lines.
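
The per-site quantities plotted in Fig. 3 A and D are branch-length-weighted percentages, which may be easier to follow as a small calculation. The sketch below is illustrative only: it assumes that each branch of a phylogeny carries a length and a dN/dS estimate for the site in question. The `Branch` structure, the function names, and the toy numbers are hypothetical and are not taken from the analysis pipeline or from the output format of any particular selection-analysis tool.

```python
from statistics import fmean
from typing import NamedTuple, Sequence


class Branch(NamedTuple):
    """One branch of a phylogeny, viewed at a single codon site (hypothetical)."""
    length: float  # branch length, e.g. substitutions per site
    omega: float   # dN/dS estimated for this site on this branch


def pct_branch_length_diversifying(branches: Sequence[Branch]) -> float:
    """Percentage of total branch length on which the site has dN/dS > 1."""
    total = sum(b.length for b in branches)
    if total == 0:
        raise ValueError("tree has zero total branch length")
    selected = sum(b.length for b in branches if b.omega > 1.0)
    return 100.0 * selected / total


def pct_branch_length_neutral_or_negative(branches: Sequence[Branch]) -> float:
    """Complement of the above: branch length with dN/dS <= 1."""
    return 100.0 - pct_branch_length_diversifying(branches)


def mean_pct_diversifying(alignments: Sequence[Sequence[Branch]]) -> float:
    """Average the per-site percentage over subsampled alignments."""
    return fmean(pct_branch_length_diversifying(tree) for tree in alignments)


# Toy data: the same site in two subsampled alignments (made-up values).
site_in_alignment_1 = [Branch(0.5, 0.2), Branch(0.25, 1.0), Branch(0.25, 3.0)]
site_in_alignment_2 = [Branch(1.0, 0.5), Branch(1.0, 2.0)]

print(pct_branch_length_diversifying(site_in_alignment_1))
print(pct_branch_length_neutral_or_negative(site_in_alignment_1))
print(mean_pct_diversifying([site_in_alignment_1, site_in_alignment_2]))
```

In this toy case, only the last branch of the first alignment has dN/dS > 1, so a quarter of the branch length counts as diversifying; a branch with dN/dS exactly 1 is grouped with neutral (or negative) selection, matching the "dN/dS > 1" threshold used in the caption.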