3. Results and discussion

3.1. Overall structure

Crystals of the hMLH1 NTD formed in space group P64 with one molecule in the asymmetric unit. The crystallized hMLH1 construct contained residues 1–340 of the full-length protein. The crystallographic model includes amino-acid residues 3–85, 98–299 and 320–336; atoms with little or no electron density were deemed to be disordered and were omitted from the final model. The model also includes ADP, an Mg2+ ion, 35 water molecules and nine sites of electron density that we could not confidently assign to specific chemical species. These sites are designated ‘UNX’ (unknown atoms or ions) in the coordinate file.

A DALI search (Holm & Rosenström, 2010) identified the E. coli MutL NTD (LN40; Ban & Yang, 1998) as the closest structural homolog (Fig. 1). Superimposition of our structure with the E. coli MutL–Mg–ADP ternary complex (PDB entry 1b62) using CEAlign (Jia et al., 2004; Shindyalov & Bourne, 1998) matches 288 Cα positions with a root-mean-square deviation (r.m.s.d.) of 2.5 Å. Given the similarity to the E. coli MutL NTD, and to be consistent with the nomenclature established by Ban & Yang (1998), we designate our structure human LN40 (hLN40).

The overall structure of hLN40 can be divided into two subdomains (Fig. 1), an ATPase domain and a ‘transducer’ domain, connected by a two-helix linker. The ATPase domain (residues 25–207) contains the noncanonical ATPase Bergerat fold, the core of which is composed of a four-stranded antiparallel β-sheet (β1–β3 and β5) and three α-helices (αB–αD) (Bergerat et al., 1997). The fold is essentially identical in topology to that observed in E. coli LN40 and identifies MLH1 as a member of the GHKL (gyrase, Hsp90, histidine kinase, MutL) ATPase/kinase superfamily of proteins (Dutta & Inouye, 2000).
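As a bookkeeping aid, the residue ranges quoted above for the construct (1–340) and for the crystallographic model (3–85, 98–299 and 320–336) can be turned into an explicit list of unmodeled segments. The following is a minimal sketch; the function and variable names are illustrative and are not part of any deposited data or software used in this work.

```python
# Residue ranges as quoted in the text; all names here are illustrative.
CONSTRUCT = (1, 340)                           # crystallized construct
MODELED = [(3, 85), (98, 299), (320, 336)]     # residues present in the model


def unmodeled_gaps(construct, modeled):
    """Return inclusive residue ranges of the construct that are absent from the model."""
    first, last = construct
    gaps = []
    cursor = first  # next residue not yet accounted for
    for start, end in sorted(modeled):
        if start > cursor:
            gaps.append((cursor, start - 1))
        cursor = max(cursor, end + 1)
    if cursor <= last:
        gaps.append((cursor, last))
    return gaps


def count_residues(ranges):
    """Total number of residues covered by a list of inclusive ranges."""
    return sum(end - start + 1 for start, end in ranges)


gaps = unmodeled_gaps(CONSTRUCT, MODELED)
modeled_count = count_residues(MODELED)

for start, end in gaps:
    print(f"unmodeled: {start}-{end}")
print(f"modeled residues: {modeled_count} of {CONSTRUCT[1] - CONSTRUCT[0] + 1}")
```

Applied to the quoted ranges, the internal gaps fall within the ATP-binding loop and within the region between the last two modeled segments of the transducer domain; the remaining gaps are the construct termini.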
The ATP-binding loop between helices αC and αD (residues 74–85 and 98–101) defines the pyrophosphate-binding site and is variable in structure and length across the family (Ban et al., 1999; Prodromou et al., 1997; Steussy et al., 2001; Wigley et al., 1991). In addition to the overall structural similarity between hLN40 and the MutL structure (Ban et al., 1999), we observed an hLN40 crystallographic dimer similar to that seen in the E. coli MutL–Mg–ADP complex. In contrast to the prokaryotic structure, however, the hLN40 ATP-binding loop is partially disordered, possibly owing to crystal packing. Accordingly, residues 86–97 have been omitted from our model owing to a lack of interpretable electron density.

The C-terminus of the ATP-binding loop is part of a conserved GFRGE(A/G)L motif (residues 98–104) that is found in related mismatch-repair proteins (Sehgal & Singh, 2012) and is an extension of motif III (the ‘G2 box’) conserved in GHKL family members (Mushegian et al., 1997). Gly98 and Gly101 are positioned adjacent to the pyrophosphate moiety of the bound ADP, permitting the close approach of ADP to the N-terminus of helix αD. This allows the negatively charged ligand to take advantage of the positive charge of approximately half a unit that arises from the helix dipole moment (Hol et al., 1978; Wierenga et al., 1985). The presence of a glycine-rich motif is consistent with a conserved mechanism that has evolved to play a crucial role in the active site of several nucleotide-binding folds (Saraste et al., 1990; Walker et al., 1982; Wierenga et al., 1985).

Residues 228–336 fold separately to form a small α/β barrel at the hLN40 C-terminus, known as the transducer domain (Classen et al., 2003). This domain is characterized by a ribosomal protein S5 domain 2-like fold (Murzin et al., 1995) and a left-handed α-helical crossover (αI) between β10 and β11 (Ban et al., 1999; Cole & Bystroff, 2009; Richardson, 1976).
A large body of evidence indicates that allosteric regulation of the transducer domain plays a central role in coordinating the downstream functions of GHKLs (Ban et al., 1999; Corbett & Berger, 2003, 2005; Lamour et al., 2002; Oestergaard et al., 2004; Wei et al., 2005; Wigley et al., 1991). In particular, the ‘QTK’ loop (hLN40 residues 298–320) has been proposed to act as an ATP ‘sensor’ that helps to couple changes in ligand binding and hydrolysis to rigid-body movements and conformational changes in the transducer domain (Wei et al., 2005). Residues 301–320 in the hLN40 QTK loop are disordered; however, we can infer from MutL structures (Ban et al., 1999) that Lys311 within the PTK motif should act as the conserved basic, γ-phosphate-sensing residue. Crystallographic studies by both Corbett & Berger (2005) and Stanger et al. (2014) highlight the importance of rigid-body motions between the ATPase and transducer domains of GHKLs. In particular, these studies identified several distinct conformational intermediates that exist along the ATP-hydrolysis pathway. However, without further structural and biochemical information on catalytically competent forms of hLN40, it remains to be seen whether these observations represent a unifying mechanism that explains how GHKLs achieve their higher-order functions in the cell.

3.2. Structural basis for the pathogenicity of MLH1 mutations

Structural and functional information may be used to assess the pathogenicity of MLH1 mutations identified during genetic testing for hereditary cancer syndromes. Here, we present two such pathogenic variants, c.83C>T (p.Pro28Leu) and c.464T>G (p.Leu155Arg) (Thompson et al., 2014). Pro28 is a buried residue at the N-terminus of αA in the ATPase domain and is completely inaccessible to the solvent (Krissinel & Henrick, 2007).
The introduction of a Leu at this tightly packed position in p.Pro28Leu is likely to introduce severe steric clashes, given its more extended side chain. Even the sterically most favorable rotamer shows increased van der Waals (vdW) strain and steric clashes involving Gly54, Gly55, Ile59 and Ile176 that are likely to disrupt the core fold of the protein (Fig. 2a).

Leu155 is also buried in the α/β sandwich of the ATPase domain, between helix αB and the extended β-sheet (Fig. 2b). Substitution by Arg at this position could have two consequences. Firstly, outside an active site or stabilizing secondary-structure element, the introduction of an unbalanced, buried charge is often considered to be destabilizing to protein structure (Kajander et al., 2000; Waldburger et al., 1995; Wimley et al., 1996). When modeled in its most favorable rotamer, the Arg at position 155 is surrounded by a cluster of nonpolar residues (Ala31, Ile25, Ile107 and Val152) and is unable to form hydrogen bonds to nearby side-chain or main-chain atoms. The second structural consequence of p.Leu155Arg relates to the compact space in the center of the α/β sandwich, which imposes a steric constraint on the type of amino acid that can be accommodated at position 155. Compared with Leu, the more extended alkyl-guanidinium side chain of Arg introduces severe steric clashes, which disrupt the architecture of the elements (for example helix αD) that form the active site of the enzyme.

Given this structural rationale, we expect the MLH1 structure reported here to be of great clinical utility in the analysis of missense variants found in patients recommended for genetic testing. The structure provides a robust platform, in combination with other strong functional or clinical evidence, to help to determine the clinical effect of loss-of-function mutations.
We caution, however, against reliance on this model to predict a benign effect in a clinical setting, as truly pathogenic variants may fall within the ‘normal’ functional range. Therefore, other factors must be considered when a seemingly benign substitution is encountered, including the possibility that a nonsynonymous change may have an effect on mRNA splicing or post-translational modification of the protein.
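The correspondence between the coding-DNA and protein descriptions of the two variants discussed above can be checked with simple arithmetic. The sketch below assumes the usual convention that coding-DNA position 1 is the A of the ATG start codon, and uses only the standard genetic code assignments CCN = Pro, CTN = Leu and CGN = Arg (N being any base). The function names are illustrative, the third codon base is not taken from the MLH1 sequence, and the CTN codon for Leu155 is inferred from the reported T>G change giving Arg rather than read from the sequence.

```python
def codon_of(cdna_pos):
    """Map a 1-based coding-DNA position to (codon number, position within codon)."""
    index = cdna_pos - 1
    return index // 3 + 1, index % 3 + 1


# First two bases of three fourfold-degenerate codon boxes in the standard
# genetic code; the third base does not affect the encoded amino acid.
BOX = {"CC": "Pro", "CT": "Leu", "CG": "Arg"}


def second_base_change(first_two, new_base):
    """Return (reference, variant) amino acids for a change at codon position 2."""
    return BOX[first_two], BOX[first_two[0] + new_base]


results = {
    # c.83C>T: Pro codon CCN with the second base changed to T
    "c.83C>T": (codon_of(83), second_base_change("CC", "T")),
    # c.464T>G: Leu codon assumed to be CTN, second base changed to G
    "c.464T>G": (codon_of(464), second_base_change("CT", "G")),
}

for variant, ((codon, pos), (ref, alt)) in results.items():
    print(f"{variant}: codon {codon}, base {pos} of codon, {ref} -> {alt}")
```

Both substitutions fall on the second base of their codons (28 and 155, respectively), which is consistent with the p.Pro28Leu and p.Leu155Arg protein-level descriptions.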