Results

Contribution of Gene Mutations to OGID

Using exome or targeted gene analyses, we identified a pathogenic mutation in one of 14 genes in 357 of the 710 individuals with OGID, giving a diagnostic yield of 50% (Figure 1). By far the most common cause was a mutation in NSD1 (240 cases, 34%), followed by EZH2 (34, 4.8%), DNMT3A (18, 2.5%), PTEN (MIM: 601728) (16, 2.3%), NFIX (MIM: 164005) (14, 2.0%), CHD8 (MIM: 610528) (12, 1.7%), BRWD3 (MIM: 300553) (7, 1.0%), HIST1H1E (5, 0.7%), PPP2R5D (3, 0.4%), EED (MIM: 605984), GPC3 (MIM: 300037), and MTOR (MIM: 601231) (two cases each), and AKT3 (MIM: 611223) and PIK3CA (MIM: 171834) (one case each) (Table S1). Among the 323 parent-proband trios, we identified a cause in 191 (59%), of which 179 were de novo mutations and 12 were inherited.

Our data confirm that EED mutations cause OGID. Two case reports of individuals with a characteristic phenotype that includes overgrowth have been published.10, 27 We here present two additional cases with a de novo EED mutation. The two individuals share the same facial phenotype with each other and with the previously reported case subjects, with long, narrow palpebral fissures, telecanthus, and retrognathia. Notably, EED is a direct binding partner of EZH2,28 which has an established role in causing OGID.29 A role in overgrowth was either known or has been proposed for the remaining genes, apart from HIST1H1E (see GeneReviews by Eng in Web Resources).6, 9, 10, 12, 29, 30, 31, 32, 33, 34, 35

HIST1H1E Mutations Cause OGID

We present here data showing that certain HIST1H1E mutations cause OGID. Through exome sequencing we identified five unrelated probands (COG0405, COG0412, COG0552, COG1739, and COG1832) with heterozygous HIST1H1E protein truncating variants (PTVs) (Figure 2, Tables 1 and S1). In four probands the PTV had arisen de novo. Parental samples were not available for the fifth child, but she carried the same mutation as one of the children with a de novo mutation.
The detection of four de novo HIST1H1E mutations in 710 individuals is highly unlikely to have occurred by chance, as determined from gene-specific de novo mutation rates (p = 5.17 × 10⁻¹⁵). None of the mutations are present in the ExAC dataset, nor in 11,677 exomes analyzed in-house with similar pipelines. These results strongly support HIST1H1E mutations as a cause of OGID.

HIST1H1E encodes histone H1.4. In humans, H1.4 is one of 11 H1 linker histones that mediate the formation of higher-order chromatin structures and regulate the accessibility of regulatory proteins, chromatin remodelling factors, and histone-modifying enzymes to their target sites.36, 37 The five mutations we identified cluster significantly (p = 2.0 × 10⁻⁹) to a 12-bp region in the carboxy-terminal domain (CTD) that is involved in chromatin binding and protein-protein interactions (Figure 2A).36 PTVs in the intronless histones have been shown to evade nonsense-mediated mRNA decay.38 Thus the OGID-causing mutations are predicted to generate a truncated product.

The CTD of linker histones regulates higher-order chromatin structure through neutralization of negatively charged linker DNA.36 The pathogenic HIST1H1E mutations all result in the same shift in the reading frame and are predicted to generate similar truncated proteins, with a reduced net charge of 7–9 (compared to 44 for the wild-type protein) (Figure 2A). The mutant protein is thus likely to be less effective in neutralizing negatively charged linker DNA. Moreover, the truncation of the C-terminus likely impedes DNA binding and protein-protein interactions. It is also noteworthy that the other possible alteration in reading frame would reduce neither the net charge nor the length of the protein (Figure 2A). Taken together, these data suggest that specific HIST1H1E mutations, restricted in position and type, cause human overgrowth.
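The chance-occurrence argument for the de novo HIST1H1E mutations can be illustrated with a Poisson model. This is only a sketch of the general approach, not the calculation used in the study: the per-allele mutation rate `MU_PER_ALLELE` is a hypothetical placeholder (the gene-specific rate is not given in this section), so the printed value is not expected to reproduce the reported p value.

```python
import math

def poisson_upper_tail(k, lam, n_terms=60):
    """Return P(X >= k) for X ~ Poisson(lam) by summing upper-tail terms directly.

    Summing the tail avoids the precision loss of computing 1 - CDF when the
    answer is many orders of magnitude below 1. Sixty terms are ample for the
    small expected counts used here.
    """
    return sum(
        math.exp(-lam) * lam**i / math.factorial(i)
        for i in range(k, k + n_terms)
    )

N_INDIVIDUALS = 710       # series size stated in the text
N_DE_NOVO_OBSERVED = 4    # de novo HIST1H1E PTVs stated in the text
MU_PER_ALLELE = 1e-6      # HYPOTHETICAL per-allele, per-generation PTV rate (assumption)

# Each individual carries two alleles, so the expected number of de novo
# events in the series is 2 * N * mu.
expected = 2 * N_INDIVIDUALS * MU_PER_ALLELE
p_value = poisson_upper_tail(N_DE_NOVO_OBSERVED, expected)

print(f"expected de novo count = {expected:.2e}")
print(f"P(X >= {N_DE_NOVO_OBSERVED}) = {p_value:.2e}")
```

Under any plausible per-gene rate the expected count is far below one, which is why observing four events yields such a small tail probability.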
HIST1H1E Clinical Phenotype

Individuals with HIST1H1E mutations had similar facial appearance in childhood with full cheeks, high hairline, and telecanthus (Figures 2B–2D). Height, head circumference, and degree of intellectual disability were variable, as were the additional clinical features. It is currently unclear whether these additional features are HIST1H1E associations or coincidental findings. Individual case descriptions are below.

COG0405, a female individual, was born at term with a weight of 3.58 kg (+0.1 SD) and a length of 53 cm (+1.5 SD). She was floppy in the neonatal period. A brain MRI scan at 4 months demonstrated mild ventricular dilatation but no other abnormalities. Her bone age at chronological age of 7 months was advanced to 18–24 months. By 19 months, her length was 87 cm (+2.0 SD) with a weight of 13.4 kg (+1.8 SD) and she had developed a strabismus. At 13 years of age, the individual was noted to have normal growth with a height of 150.8 cm (−0.6 SD), a head circumference of 55.8 cm (−0.5 SD), and a weight of 48.85 kg (+0.4 SD). She has developed a severe kyphoscoliosis for which she required surgery and has a mild intellectual disability.

COG0412, a male individual, was born 1 week after term following an uncomplicated pregnancy and delivery. He weighed 4.75 kg (+2.4 SD). In the neonatal period he was noted to be floppy; he had poor feeding and undescended testes. At 1.5 years he was very tall at 105 cm (+8.3 SD) with a weight of 18.8 kg (+4.6 SD) and a head circumference of 52.5 cm (+2.6 SD). He was reported to have multiple nevi and redundant skin on the palms of his hands. He had a moderate intellectual disability and no behavioral issues at that time. When he was reviewed at 15.5 years, he was no longer tall with a height of 166.5 cm (−0.6 SD). His head circumference was 58.7 cm (+1.4 SD). By this age he had developed an anxiety disorder that was refractory to medical treatment. He had also developed phobias.
In addition, he had major dental problems with crumbling teeth and he had dry, flaky nails.

COG0552, a female individual, was born at term with a weight of 4.79 kg (+2.5 SD) and length of 57 cm (+3.6 SD). She was floppy in the neonatal period with poor feeding. She developed no new medical problems in childhood. At the age of 4.2 years she was reported to be delayed in her development. She had a height of 108 cm (+1.2 SD), head circumference of 55 cm (+3.2 SD), and weight of 24 kg (+2.7 SD).

COG1739, a female individual, was initially thought clinically to have Weaver syndrome. She was born at 37 weeks after an uncomplicated pregnancy and labor with a weight of 3.25 kg (+0.8 SD), length of 49 cm (+0.7 SD), and head circumference of 37 cm (+3.3 SD). She was hypoglycemic and hypertonic in the neonatal period, and was also noted to have camptodactyly. At 1.9 years she was diagnosed with a moderate intellectual disability and had a height of 85 cm (mean), head circumference of 51 cm (+1.8 SD), and weight of 12 kg (−0.3 SD).

COG1832, a male individual, was born 1 week after term weighing 3.74 kg (+0.4 SD). The pregnancy had been complicated by exposure to chicken pox. At birth, COG1832 was noted to have talipes equinovarus and later in the neonatal period was diagnosed with delayed visual maturation. A brain MRI scan showed a slender corpus callosum and unusual ventricular outline, possibly indicative of a periventricular leukomalacia. At 8.5 years, height was 133.2 cm (+0.5 SD) with a weight of 33 kg (+1.2 SD). The head circumference at 6.3 years was 59 cm (+3.7 SD). He has limited speech but with verbal comprehension markedly ahead of his ability to express himself. He has left amblyopia and astigmatism. His hearing is normal. He suffers from constipation. At times his behavior is challenging.
Functional Network Analyses

To investigate the biological processes abrogated by OGID pathogenic mutations, we performed functional enrichment analysis using the GO molecular function terms and KEGG pathway gene sets in g:Profiler.24 The chromatin binding (FDR q value = 1.58 × 10⁻⁶) and PI3K/AKT signaling pathway (FDR q value = 6.80 × 10⁻⁵) gene sets were significantly enriched.

Six genes (NSD1, EZH2, DNMT3A, EED, CHD8, and HIST1H1E) were in the chromatin binding gene set. All encode proteins involved in epigenetic regulation (Figure 3A). NSD1 is a histone methyltransferase that catalyzes methylation of H3K36, and to a lesser extent H4K20, and is primarily associated with transcriptional activation.39 EZH2 and EED are key components of the polycomb repressive complex 2 (PRC2), which catalyzes methylation of H3K27, resulting in transcriptional repression of target genes.28 DNMT3A is a DNA methyltransferase crucial for the establishment of new methylation marks during early embryogenesis and the sex-dependent methylation of imprinted genes.40, 41 CHD8 encodes an ATP-dependent chromatin remodeler that binds to methylated H3K4, a key histone modification at active promoters.35 As noted above, H1.4 binds to linker DNA between nucleosomes and has key roles in chromatin compaction and regulation of gene expression.37 Together, mutations in these six genes accounted for 311 individuals (44%) in our series. Disruption of epigenetic regulation is therefore a prominent molecular mechanism underlying OGID (Figure 1).

Five of the genes are in the PI3K/AKT pathway, which plays a key role in the regulation of growth (Figure 3B): PTEN, AKT3, PIK3CA (which encodes p110α, the catalytic subunit of the heterodimeric PI3K lipid kinase), MTOR, and PPP2R5D (which encodes B56δ, a regulatory subunit of the heterotrimeric PP2A protein phosphatase).
Activation of the PI3K/AKT pathway results in cellular growth promotion through increased cell metabolism, cell survival, cell turnover, and protein synthesis.42 Together, mutations in these genes made only a minor contribution to our OGID series (23 case subjects, 3.2%). In part this is because individuals with mutations in these genes are more often diagnosed with other conditions, such as Cowden syndrome (MIM: 158350), megalencephaly-capillary malformation syndrome (MIM: 602501), or regional overgrowth (see GeneReviews by Eng in Web Resources).34

The remaining three genes (NFIX, GPC3, and BRWD3) encode a transcription factor, a proteoglycan, and a bromodomain-containing protein, respectively (23 case subjects, 3.2%).6, 31, 32 There is currently no clear functional link between these genes and the other genes we report here. However, it is possible that BRWD3 mutations also cause overgrowth through epigenetic regulation dysfunction, as there are data suggesting it is involved in histone H3.3 regulation.43

Phenotype Analyses

As would be expected, mutations were enriched in individuals with both increased height and increased head circumference, compared to individuals in whom only one growth parameter was increased. Specifically, the diagnostic yield in individuals with both macrocephaly and increased height was 59% (120/205), significantly higher than the diagnostic yields in individuals with only macrocephaly (43%, 47/109, p = 0.006) or only increased height (45%, 62/138, p = 0.009). There was no significant difference between the diagnostic yields in individuals with only macrocephaly and in those with only increased height (p = 0.146). There was also no significant difference between the diagnostic yield in individuals with unspecified growth parameters (50%, 130/258) and any other group.
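The diagnostic yields by growth category follow directly from the counts above, and pairs of categories can be compared with a test on the corresponding 2 × 2 table. The sketch below recomputes the yields and applies a two-sided Fisher exact test. The choice of test is an assumption made for illustration: this section does not name the test used, so the p values printed here need not match those reported.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x "diagnosed" in row 1, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables no more probable than the observed one
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))

# (diagnosed, total) per growth category, as stated in the text
groups = {
    "macrocephaly + increased height": (120, 205),
    "macrocephaly only": (47, 109),
    "increased height only": (62, 138),
    "unspecified growth parameters": (130, 258),
}

yields = {name: 100 * dx / n for name, (dx, n) in groups.items()}
for name, pct in yields.items():
    print(f"{name}: {pct:.0f}%")

# Both parameters increased vs. macrocephaly only
p_both_vs_macro = fisher_exact_two_sided(120, 205 - 120, 47, 109 - 47)
# Macrocephaly only vs. increased height only
p_macro_vs_height = fisher_exact_two_sided(47, 109 - 47, 62, 138 - 62)
print(f"both vs macrocephaly only: p = {p_both_vs_macro:.3g}")
print(f"macrocephaly only vs height only: p = {p_macro_vs_height:.3g}")
```

The four category totals sum to the 710 individuals in the series, which is a quick consistency check on the counts.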
To further explore the phenotypic spectrum of OGID, we compared growth and intellectual disability severity between individuals with OGID due to mutations in the epigenetic regulation genes and those with OGID due to mutations in the PI3K/AKT pathway genes, using case subjects for which the relevant phenotypic information was available (217 individuals with complete growth data and 263 individuals with intellectual disability severity information) (Figure 4). Macrocephaly (i.e., head circumference ≥2 SD above the mean) occurred more frequently in individuals with PI3K/AKT pathway gene mutations; all 17 had macrocephaly, compared with 140/200 individuals with OGID due to epigenetic regulation gene mutations (p = 4.1 × 10⁻³; Figure 4A). Furthermore, 9/17 of the PI3K/AKT pathway case subjects had macrocephaly without increased height compared with 32/200 of the epigenetic regulation case subjects (p = 1.0 × 10⁻³; Figure 4A). The remaining 60/200 had increased height without macrocephaly, a combination not present in OGID due to PI3K/AKT pathway gene mutations (p = 4.1 × 10⁻³; Figure 4A). Varying severity of intellectual disability was a feature of both groups, but mild intellectual disability was more common in OGID due to PI3K/AKT pathway gene mutations (14/20) than in OGID due to epigenetic regulation gene mutations (101/243; p = 0.01) (Figure 4B).

The risk of childhood cancer is one of the most controversial areas of OGID management. Overall, 8/710 OGID-affected individuals in this study developed cancer in childhood (Table S1). This includes 4/357 with an identified genetic cause, three of whom had an EZH2 mutation. COG1724 developed neuroblastoma at 46 months, COG0285 developed T cell non-Hodgkin lymphoma at 13 years, and COG1521 was diagnosed with both neuroblastoma and acute lymphoblastic leukemia at 13 months. The childhood cancer incidence for EZH2 mutation carriers in this study was thus 9% (3/34). The remaining child had an NSD1 microdeletion and T cell non-Hodgkin lymphoma.
This information will be useful in family discussions about childhood cancer risk, particularly in relation to surveillance strategies, which are generally of unproven benefit and can be associated with appreciable false positive rates.44

Height GWAS Loci Comparative Analyses

We next explored the overlap between the 14 OGID genes and the 611 genes implicated through genome-wide association studies (GWASs) in the control of human height.25 There was significant overlap; six genes involved in OGID were also present in height GWAS regions (p = 6.8 × 10⁻⁸) (Figure S1). The overlap is primarily through the epigenetic regulation genes, all of which (except EED) were represented in height GWAS regions. Two separate intronic SNPs in each of NSD1 and DNMT3A were independently associated with height in the GWAS and there were no other genes within the linkage disequilibrium (LD) blocks of association. This strongly suggests that functional effects on NSD1 and DNMT3A underlie the height associations in these regions (Figure S1). Single SNPs in intron 5 of CHD8, intron 9 of MTOR, 1 kb downstream of HIST1H1E, and 48 kb upstream of EZH2 were also associated with height.25 For HIST1H1E and EZH2, there were no other genes in the LD block of association. For MTOR, the variant was associated with a cis-eQTL affecting MTOR expression, though the association was better accounted for by an upstream variant (rs2295080) in the MTOR promoter region that was in LD with the height SNP (LD r² = 0.85).25 Although the causal SNPs and mechanisms of association are not fully elucidated, these data suggest that common variation in some genes involved in OGID also influences height at a population level.
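An overlap of this kind between two gene lists can be assessed as a hypergeometric tail probability: the chance of seeing at least six of the 14 OGID genes among the 611 height GWAS genes if the 14 were drawn at random from the genome. The sketch below illustrates that framing only. The background size `BACKGROUND_GENES` is a hypothetical round number, and the test actually used in the study is not described in this section, so the output is not expected to equal the reported p value.

```python
from math import comb

def hypergeom_upper_tail(k, n_draws, n_marked, n_total):
    """P(X >= k) when n_draws genes are drawn without replacement from
    n_total genes, of which n_marked belong to the reference list."""
    denom = comb(n_total, n_draws)
    upper = min(n_draws, n_marked)
    numer = sum(
        comb(n_marked, i) * comb(n_total - n_marked, n_draws - i)
        for i in range(k, upper + 1)
    )
    return numer / denom

N_OGID_GENES = 14          # stated in the text
N_GWAS_GENES = 611         # stated in the text
N_OVERLAP = 6              # stated in the text
BACKGROUND_GENES = 20_000  # HYPOTHETICAL genome-wide background size (assumption)

p_overlap = hypergeom_upper_tail(
    N_OVERLAP, N_OGID_GENES, N_GWAS_GENES, BACKGROUND_GENES
)
print(f"P(overlap >= {N_OVERLAP}) = {p_overlap:.2e}")
```

The result is sensitive to the assumed background: a larger gene universe makes the same overlap less likely by chance, so the background should be stated whenever such a p value is quoted.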
Cancer Somatic Driver Mutation Comparative Analyses

Dysregulated cellular growth is a hallmark of cancer, and certain human conditions are associated with both overgrowth and increased cancer risk (see GeneReviews by Eng in Web Resources).45 We therefore next sought to investigate the overlap between the 14 genes and 260 somatically mutated cancer driver genes reported by Lawrence et al.26 There was significant overlap; 8/14 genes involved in OGID were somatically mutated in a diverse range of cancers (NSD1, EZH2, DNMT3A, PTEN, CHD8, HIST1H1E, MTOR, PIK3CA; p = 1.7 × 10⁻¹⁴).

For the PI3K/AKT pathway genes, the mutation spectra are similar in OGID and cancer.34 By contrast, for the epigenetic regulation genes, the mutation spectra in OGID and cancer have substantial, distinctive differences.

Somatic mutations in HIST1H1E, EZH2, and DNMT3A occur in hematological malignancies.26, 46, 47, 48, 49, 50 HIST1H1E and EZH2 mutations are each present in ∼20% of B cell lymphomas.48, 49 Somatic HIST1H1E mutations are nonsynonymous mutations throughout the gene and do not include the clustered PTVs that cause OGID (Figure 5). EZH2 mutations in B cell lymphomas are often activating nonsynonymous mutations in the SET domain, the majority of which target a single amino acid, p.Tyr646.48 Nonsynonymous mutations at this residue have not been detected in OGID and are not present in ExAC, perhaps suggesting that germline EZH2 mutations altering p.Tyr646 are not compatible with life (Figure 5). Inactivating EZH2 mutations are present in myeloid malignancies and in T-ALL.46, 47, 48 A proportion of these latter mutations overlap with EZH2 mutations in OGID. DNMT3A is one of the most frequently mutated genes in AML and mutations also occur less frequently in other hematological malignancies.26, 50 The majority target a single residue, p.Arg882, with the remainder being nonsynonymous variants and PTVs scattered through the gene.
Mutations at p.Arg882 have not thus far been reported in OGID (Figure 5). Protein modeling suggests that the somatic mutations primarily impact DNA binding, whereas the mutations in OGID are more likely to impact histone binding.12

Somatic NSD1 mutations are seen in ∼10% of head and neck squamous cell carcinomas26, 51 and somatic CHD8 mutations are present in ∼3% of glioblastoma multiforme (GBM).26 For these cancers the mutation pattern is similar to that observed in OGID, with PTVs being the most frequent mutation type (Figure 5).30 Interestingly, Lawrence et al. found NSD1 and CHD8 to each be significant in their pan-cancer analysis, present in 2% of cancers.26 However, the pan-cancer mutation spectrum for each gene was different from that observed in OGID, with most being nonsynonymous mutations scattered throughout the gene (Figure 5).