Recurring and Adaptable Binding Motifs in Broadly Neutralizing Antibodies to Influenza Virus Are Encoded on the D3-9 Segment of the Ig Gene
Abstract
Highlights
• Structure of bnAb S9-3-37 bound to influenza hemagglutinin stem was determined
• D3-9-encoded region of S9-3-37 contributes majority of the interaction surface with HA
• D3-9 gene segment of S9-3-37 can engage the HA stem in two different reading frames
• D3-9 gene segment represents a recurring mechanism for antibody targeting of HA stem
SUMMARY
Discovery and characterization of broadly neutralizing antibodies (bnAbs) to the influenza hemagglutinin (HA) stem have provided insights for the development of a universal flu vaccine. Identification of signature features common to bnAbs from different individuals will be key to guiding immunogen design. S9-3-37 is a bnAb isolated from a healthy H5N1 vaccinee. Here, structural characterization reveals that the D3-9 gene segment of S9-3-37 contributes most of the interaction surface with the highly conserved stem epitope on HA. Comparison with other influenza bnAb crystal structures indicates that the D3-9 segment provides a general mechanism for targeting HA stem. Interestingly, such bnAbs can approach the HA stem with vastly different angles and orientations. Moreover, D3-9 can be translated in different reading frames in different bnAbs yet still target the same HA stem pocket. Thus, the D3-9 gene segment in the human immune repertoire can provide a robust defense against influenza virus.
Identifying signature features common to broadly neutralizing antibodies (bnAbs) is key to universal flu vaccine design. Wu et al. report that the D3-9 encoded segment of an influenza hemagglutinin stem-targeting bnAb contributes the majority of the interaction surface and is a recurring motif in antibodies that target the hemagglutinin stem.
Development of a universal influenza vaccine has been of long-term interest for global health. In the past decade, the discovery and characterization of broadly neutralizing antibodies (bnAbs) to the influenza hemagglutinin (HA) stem have provided invaluable insights for development of a universal influenza vaccine (Impagliazzo et al., 2015; Wu and Wilson, 2018; Yassine et al., 2015). The HA stem domain, unlike the receptor-binding head domain, is highly conserved across influenza strains and subtypes (Nobusawa et al., 1991). Therefore, antibodies that target the HA stem usually have much greater breadth than those that target the hypervariable head (Wu and Wilson, 2017), which is the location of the major antigenic sites (Gerhard et al., 1981; Wiley et al., 1981). Consequently, elicitation of stem-binding bnAbs has become a major goal for universal influenza vaccine development (Erbelding et al., 2018).
Understanding features that are common to bnAbs isolated from different individuals, whether through natural infection or vaccination, is important for immunogen design in the development of a universal influenza vaccine. As more and more influenza bnAbs are characterized, several multidonor classes of bnAbs have begun to emerge (Avnir et al., 2014; Joyce et al., 2016; Wu and Wilson, 2017) . Stem-binding bnAbs that utilize V H 1-69 germline genes are probably the most well-characterized multidonor class (Dreyfus et al., 2012; Ekiert et al., 2009; Kashyap et al., 2008; Lang et al., 2017; Sui et al., 2009; Throsby et al., 2008) , and the alleles used code for a signature motif consisting of Ile53 and Phe54 in complementarity-determining region (CDR) H2 and Tyr98 in CDR H3 (Avnir et al., 2014; Lang et al., 2017) . Recently, Joyce et al. reported three additional multidonor classes of bnAbs (Joyce et al., 2016) : (1) bnAbs that utilize V H 6-1 and D3-3 germline genes with a signature motif consisting of MIFGI in CDR H3; (2) bnAbs that utilize V H 1-18 and D3-9 with a signature motif of RXXILTG in CDR H3; and (3) bnAbs that utilize V H 1-18 and a QXXV motif in CDR H3. It is likely that more multidonor classes of bnAbs have yet to be discovered.
We describe here the structure of bnAb S9-3-37, which targets the HA stem mainly using a relatively long CDR H3 of 21 residues. Interestingly, CDR H3 of S9-3-37 has a conformation very similar to that of the pan-influenza A bnAb FI6v3 (Corti et al., 2011), despite the huge disparity in their overall orientations with respect to the HA stem. Both S9-3-37 and FI6v3 utilize the D3-9 gene segment and encode an LXYFXWL motif in CDR H3 that is critical for targeting the most conserved part of the stem epitope. We further performed a structural comparison between S9-3-37 and 31.b.09 (Joyce et al., 2016), which also utilizes the D3-9 gene segment. Our analysis reveals that the D3-9 gene segments of S9-3-37 and 31.b.09 are translated in different reading frames. Nonetheless, Phe in the D3-9-encoded LXYFXWL motif of S9-3-37 and Leu in the D3-9-encoded ILTG motif of 31.b.09 target the same pocket in the HA stem.
Moreover, both reading frames of the D3-9 gene segment are utilized by influenza bnAbs from multiple vaccinated individuals (Joyce et al., 2016) and can be found in the B cell repertoires of healthy donors (DeKosky et al., 2015) . Overall, this study reveals that the D3-9 gene segment offers a recurrent and robust strategy for targeting the HA stem in which at least two different binding poses can be utilized depending on the reading frame.
Structural Characterization of S9-3-37
S9-3-37 is a group 1-specific bnAb that was isolated from healthy volunteers who were vaccinated with the H5N1 prepandemic vaccine (Yamayoshi et al., 2018). S9-3-37 uses the VH1-18 and Vκ2-24 heavy- and light-chain V genes, respectively (Figures S1A and S1B), and binds to group 1 HAs with high affinity, but with no detectable binding to group 2 HAs (Table 1). Here, we determined the crystal structure of S9-3-37 Fab in complex with the HA from an H5N1 strain A/Vietnam/1203/2004 (Viet04) to 2.9 Å resolution (Table S1; Figure S2). S9-3-37 binds to the HA stem region (Figure 1A), and one structural feature that immediately stands out is the extensive contact of CDR H3 with the stem region (Figure 1B). The overall conformation of CDR H3 of S9-3-37 at the HA-Fab interface is remarkably similar to that of CDR H3 of FI6v3 (Corti et al., 2011). Moreover, the molecular interactions attributable to the D gene segment are almost identical between S9-3-37 and FI6v3. While S9-3-37 and FI6v3 utilize different VH and J gene segments, they both utilize the D3-9 gene segment and have a relatively long CDR H3 (Figure S1C). CDR H3 of FI6v3 is 20 residues, whereas that of S9-3-37 is 21 residues (Kabat numbering). Interestingly, despite the similarity in the conformation of CDR H3, the overall orientations of S9-3-37 and FI6v3 are very different (Figure 1A). When bound to HA, S9-3-37 adopts an ~180° rotation relative to FI6v3, such that the positions of the heavy chain and light chain are swapped compared to FI6v3 with respect to their interaction with the HA stem epitope. To understand the difference in the HA binding orientation of S9-3-37 and FI6v3, structural alignments between their heavy-chain variable domains (VH) were performed. When their D gene-encoded regions are superimposed, the bases of their CDR H3s twist with respect to each other (Figure 1C).
When their V H framework regions are aligned, the CDR H3 loops of S9-3-37 and FI6v3 point in opposite directions (Figure 1D) . Together, our structural analysis reveals that the S9-3-37 and FI6v3 antibodies engage the HA stem surface in very different overall orientations and angles of approach, but nevertheless still have very similar CDR H3 conformations, especially for the residues that are important for engaging the conserved HA stem epitope.
Paratope Analysis Reveals the Importance of D3-9 in Binding
Next, we compared the paratopes of S9-3-37 and FI6v3 and computed the buried surface areas (BSAs) in the S9-3-37-HA complex and FI6v3-HA complex (PDB 3TZN) (Corti et al., 2011) (see STAR Methods). The paratope of S9-3-37 is dominated by CDR H3, similar to FI6v3 (Figures 2A and 2B) (Corti et al., 2011), and only one framework residue interacts with the HA (VL Asp1 hydrogen bonds with the HA1 Asn289 glycan; Figure S3A). The total BSA on S9-3-37 is 910 Å² (868 Å² from the heavy chain and 42 Å² from the light chain), with 83% (756 Å²) attributable to CDR H3. In comparison, the total BSA on FI6v3 is 903 Å² (714 Å² from the heavy chain and 189 Å² from the light chain), with 71% (638 Å²) from CDR H3. Similarly, the BSA on the HA is comparable in the S9-3-37-HA complex (832 Å²) and FI6v3-HA complex (817 Å²). Thus, binding of S9-3-37 to the HA is dominated by the relatively long CDR H3, which is similar to that of FI6v3. In CDR H3 of S9-3-37, the D3-9 gene encodes seven residues, LGYFDWL, at positions 100b to 100h (Figures 2C and S1C). With the exception of Gly100c and Asp100f, the other five hydrophobic residues (Leu100b, Tyr100d, Phe100e, Trp100g, and Leu100h) all interact with the HA stem. Of note, all five of these contacting residues are encoded by the germline D3-9, and therefore, none arise from somatic mutations. The buried surface area of this D3-9-encoded region is 495 Å² (Figure 2D), which accounts for 54% of the total BSA of S9-3-37. This observation further highlights the importance of the D3-9 gene segment in binding to the HA stem.
Our structural analysis does, however, reveal a functional role for somatic mutations in CDR H2 of S9-3-37. Among the 17 residues in CDR H2, eight are somatically mutated from the inferred germline gene ( Figure S1A ). N54D and N56K in CDR H2 engage in electrostatic interactions with HA1 Lys32 and HA2 Asp57, respectively ( Figure S3B ). N56K, along with N58K, also hydrogen bonds with Gln99 in CDR H3 ( Figure S3C ). These observations suggest that somatic mutations in CDR H2 not only facilitate interactions with HA but also help stabilize the base of CDR H3. Therefore, somatic mutations in CDR H2 are likely to play a critical role during affinity maturation of S9-3-37.
Figure 1.
(A) Comparison between structures of S9-3-37 in complex with H5 HA and FI6v3 in complex with H3 HA (PDB 3ZTJ) (Corti et al., 2011). HA1 is in wheat, HA2 in white, VH in pink, and VL in cyan. For clarity, only one Fab is shown per trimer, and the two HA protomers without their corresponding Fab are in gray.
(B) Comparison of the CDR H3 loops of S9-3-37 and FI6v3. The region corresponding to the D gene is in purple. Of note, despite occupying the same space, Leu100b in S9-3-37 is encoded by the D gene, whereas Leu100a in FI6v3 is encoded by N-region addition (Figure S1). Similarly, Glu100i in S9-3-37 is encoded by N-region addition, whereas Ser100h in FI6v3 is encoded by the D gene.
(C and D) Structural comparison between S9-3-37 and FI6v3 reveals a twist in the base of CDR H3 with respect to each other. Only heavy-chain variable domains of S9-3-37 and FI6v3 are shown. Heavy-chain constant domains and light chains are not shown for clarity. Regions corresponding to CDR H3 are highlighted. Alignments are based on (C) the D gene-encoded region or (D) the entire heavy-chain variable domain.
Figure 3.
(B) Sequence conservation of each epitope residue was quantified by sequence entropy, which was calculated based on the alignment of 20 representative strains from different types and subtypes shown in Figure S4. "Both," residues that are common between the epitopes of FI6v3 and S9-3-37. "FI6v3," residues that are unique to the FI6v3 epitope. "S9-3-37," residues that are unique to the S9-3-37 epitope.
Epitope Analysis Explains the Group 1 Specificity of S9-3-37
Many HA residues are targeted by both S9-3-37 and FI6v3 (Figure 3A), namely HA1 residue 289, and HA2 residues 18, 19, 20, 21, 38, 41, 42, 45, 46, 48, 49, 52, 53, 56, and 57 (H3 numbering is used throughout, unless indicated otherwise). In contrast, some HA residues (HA1 18, 38, 39, 40, 291, 292, and 318; and HA2 36, 50, and 54) are only targeted by S9-3-37, but not FI6v3, and vice versa (HA1 residues 8, 28, 29, 30, 287, 290, and 316; and HA2 residues 39 and 43). Those shared epitope residues, which are targeted by CDR H3 (Figure 3A), are more conserved than those epitope residues that are unique to either S9-3-37 or FI6v3 (Figures 3B and S4). In contrast to FI6v3, which binds to both group 1 and group 2 HAs, S9-3-37 only binds to group 1 HAs. Natural amino acid variants in HA1 residue 38 and HA2 residue 50, both of which are epitope residues unique to S9-3-37, may explain the narrower breadth of S9-3-37 as compared to FI6v3. Most group 1 HAs carry an Asn at HA2 residue 50, whereas HA2 residue 50 is highly conserved as Gly among group 2 HAs (Figure S3D). In our crystal structure, HA2 Asn50 forms an H-bond with Arg97 of S9-3-37 VH (Figure S3E). When Gly is present at HA2 residue 50, this H-bond interaction would be abolished, which can partially explain the lack of binding of S9-3-37 to group 2 HAs. At HA1 residue 38, an N-glycosylation site is present in most group 2 HAs, but not in group 1 HAs, and can provide steric hindrance to the HA stem epitope (Figure S3F). CDR L1 of S9-3-37 is spatially proximal to HA1 residue 38 (Figure S3G), and a glycan at HA1 residue 38 in group 2 HAs would likely clash with CDR L1 of S9-3-37. In comparison, although CDR H2 of FI6v3 occupies a similar space (Figure S3G), it is more distant from HA1 residue 38, which allows it to accommodate a glycan at HA1 residue 38.
In fact, the N-glycosylation site at HA1 Asn38 has limited the ability of several stem-binding bnAbs to neutralize group 2 influenza subtypes (Wu and Wilson, 2017). Together, these analyses show that while both S9-3-37 and FI6v3 utilize a long CDR H3 to target the most conserved part of the epitope, other epitope residues that are unique to each antibody are less conserved and play a critical role in determining their breadth. The different angles of approach may also be critical for the difference in breadth between S9-3-37 and FI6v3, as shown for VH1-69-encoded bnAbs (Lang et al., 2017).
Joyce et al. discovered a number of bnAbs, including one called 31.b.09, that utilized VH1-18 and D3-9 (Joyce et al., 2016), the same gene segments as in S9-3-37. Nonetheless, when we compared the sequence of S9-3-37 to that of 31.b.09, we noticed that their D3-9 gene segments were translated in different reading frames (Figure 4A). In other words, despite the usage of the same D gene segment, the corresponding amino acid sequences in the D gene-encoded regions of S9-3-37 and 31.b.09 are totally different. The D3-9 gene segment in S9-3-37 encodes the amino acids LGYFDWL, whereas the D3-9 gene segment in 31.b.09 encodes the amino acids ILTG. When we compared the structure of the S9-3-37-HA complex with that of the 31.b.09-HA complex (PDB 5K9O) (Joyce et al., 2016), we found that the Phe within the LGYFDWL motif in S9-3-37 and the Leu within the ILTG motif in 31.b.09 occupy the same pocket in the HA stem region. This observation demonstrates that the D3-9 gene segment can engage the HA stem with two different reading frames. Unlike the two reading frames outlined above, the third reading frame of the D3-9 gene segment contains a stop codon in the middle of the translated protein sequence, making it unlikely to be utilized (Figure S5A).
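The reading-frame behavior described above can be illustrated with a short translation sketch. The nucleotide sequence used here is an assumption: it is the IGHD3-9*01 germline as commonly listed in public germline references, not a sequence given in this paper, and it should be verified against the reference database before reuse. The function and constant names are hypothetical.

```python
from itertools import product

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# ASSUMPTION: human IGHD3-9*01 germline nucleotide sequence (not from the paper).
IGHD3_9 = "GTATTACGATATTTTGACTGGTTATTATAAC"


def translate_frames(nt):
    """Translate a nucleotide string in its three forward reading frames.

    Incomplete trailing codons are ignored; stop codons are shown as '*'.
    """
    nt = nt.upper()
    frames = []
    for offset in range(3):
        codons = [nt[i:i + 3] for i in range(offset, len(nt) - 2, 3)]
        frames.append("".join(CODON_TABLE[c] for c in codons))
    return frames


if __name__ == "__main__":
    for offset, peptide in enumerate(translate_frames(IGHD3_9)):
        print(f"offset {offset}: {peptide}")
```

Under this assumed germline sequence, one frame contains a stretch matching LXYFXWL, a second contains ILTG, and the third is interrupted by a stop codon, which is consistent with the description in the text.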
Five of the six subjects studied by Joyce et al. utilized the D3-9 gene segment in their cross-reactive B cells (Joyce et al., 2016) , with varying frequency ( Figure 4B ). Based on a motif searching approach (see STAR Methods), we classified the CDR H3 sequence of each B cell clone that utilized the D3-9 gene segment into S9-3-37-like or 31.b.09-like. Briefly, those D3-9-encoded CDR H3 sequences that contained an LXYFXWL motif, where X represents any amino acid, were classified as S9-3-37-like. Those D3-9-encoded CDR H3 sequences that contained an ILTG motif, with one mismatch allowed at position 1, 3, or 4 (i.e., no variation was allowed for Leu at the second position), were classified as 31.b.09-like. S9-3-37-like CDR H3 could be observed in four heterosubtypic cross-reactive B cells from subjects 31, 54, and 56, whereas 31.b.09-like CDR H3 could be observed in 78 heterosubtypic cross-reactive B cells from subjects 1, 16, and 31 ( Figures 4C and 4D ; Table S2 ). The four S9-3-37-like CDR H3 sequences identified in this analysis are relatively long (19-26 residues, Figure 4D ), as also seen in S9-3-37 (21 amino acid) and FI6v3 (20 amino acid), suggesting that they may bind to HA stem in a similar manner to S9-3-37 and FI6v3. Joyce et al. indeed showed that antibodies from three of those four B cell clones (31.d.01, 54.e.01, and 56.h.01) could neutralize multiple influenza subtypes (Joyce et al., 2016) . Of note, both 31.d.01 and 56.h.01 are group 1-specific bnAbs, whereas 54.e.01 can neutralize viruses from both group 1 and group 2.
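The motif-based classification described above can be sketched as follows. This is a minimal illustration, not the authors' script: it assumes that each input CDR H3 has already been assigned to the D3-9 gene segment by an upstream germline-assignment step, and the precedence given to the LXYFXWL motif when both motifs match is an arbitrary choice made here. The example sequences are invented for illustration only.

```python
import re

# LXYFXWL, where X is any amino acid.
S9_3_37_MOTIF = re.compile(r"L.YF.WL")
ILTG = "ILTG"


def is_31b09_like(cdr_h3):
    """True if the sequence contains ILTG with at most one mismatch.

    The mismatch may fall at position 1, 3, or 4; Leu at position 2 is required.
    """
    for start in range(len(cdr_h3) - len(ILTG) + 1):
        window = cdr_h3[start:start + len(ILTG)]
        if window[1] != "L":
            continue
        mismatches = sum(a != b for a, b in zip(window, ILTG))
        if mismatches <= 1:
            return True
    return False


def classify_cdr_h3(cdr_h3):
    """Classify a D3-9-encoded CDR H3 amino acid sequence by motif."""
    if S9_3_37_MOTIF.search(cdr_h3):
        return "S9-3-37-like"
    if is_31b09_like(cdr_h3):
        return "31.b.09-like"
    return "other"


if __name__ == "__main__":
    # Hypothetical toy sequences, not real antibody CDR H3s.
    for seq in ["ARAAALGYFDWLAAADY", "ARGGILTGYYFDY", "ARGGVLTGYYFDY", "ARGGIMTGYYFDY"]:
        print(seq, classify_cdr_h3(seq))
```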
We also analyzed four additional datasets to examine the frequency of D3-9 usage in heterosubtypic cross-reactive B cells. These include 306 H1/H3/H7 cross-reactive B cells (Andrews et al., 2017), 55 H1/H3 cross-reactive B cells (McCarthy et al., 2018), 42 HA stem-specific B cells (Andrews et al., 2015), and 198 HA stem-specific B cells (Corti et al., 2011; Pappas et al., 2014). Overall, 16% (198 out of 1,246) of heterosubtypic cross-reactive B cells utilized the D3-9 gene segment across all analyzed datasets. These findings indicate a preference for the D3-9 gene segment in HA heterosubtypic cross-reactive B cells.
We further performed a structural comparison between S9-3-37 and a representative bnAb reported in Andrews et al. (2017) , namely 27-1C08 (V H 1-2+D3-9) that contains the LXYFXWL motif. Despite the vast difference in the overall conformations of the CDR H3 loops in 27-1C08 (PDB 5WCA) (Andrews et al., 2017) and S9-3-37, the conformations of their D3-9-encoded regions are highly similar ( Figure S6A ). While the structure of 27-1C08 in complex with HA is not available, our observations suggest that the D3-9-encoded regions from 27-1C08 would bind to the same stem pocket as S9-3-37. These results also support the notion that the D3-9 gene segment provides a recurring strategy to target the HA stem.
We further aimed to analyze the D3-9-encoded CDR H3s in the B cell repertoires of healthy donors. We utilized a dataset from DeKosky et al., which contained a total of 134,345 unique CDR H3 sequences from three healthy donors that were derived from next-generation sequencing data (DeKosky et al., 2015). Both 31.b.09-like and S9-3-37-like CDR H3 sequences could be found (Figure 5A), although the frequency of 31.b.09-like CDR H3 sequences (~1%) was much higher than that of S9-3-37-like CDR H3 sequences (~0.01% to ~0.1%). S9-3-37-like CDR H3s that are at least 20 amino acids long could be observed in two of the three healthy donors (Figure 5B). This analysis shows that B cells that utilize an S9-3-37-like CDR H3, despite being less common than those with a 31.b.09-like CDR H3, are prevalent in the B cell repertoires of healthy donors.
To assess the relative prevalence of 31.b.09-like (ILTG motif) and S9-3-37-like (LXYFXWL motif) CDR H3s in targeting non-HA antigens, we analyzed two published datasets. The first dataset describes the memory B cell repertoires from four donors after administration of two meningococcal vaccines (Galson et al., 2015). The second dataset describes the memory B cell repertoire from three time points that are 1-2 years apart in an HIV-infected patient (Huang et al., 2016). As compared to the 1,246 HA cross-reactive antibodies that were analyzed above (Figures 4B, 4C, and S5B-S5E), the S9-3-37-like CDR H3 is rare in the memory B cell repertoires of the meningococcal-vaccinated individuals and the HIV-infected patient (Figure 6). Among HA cross-reactive antibodies, the frequency of S9-3-37-like CDR H3 (2.2%, 27 out of 1,246) is ~3-fold lower than that of 31.b.09-like CDR H3 (6.7%, 84 out of 1,246). In contrast, in the memory B cell repertoires of meningococcal-vaccinated individuals and the HIV-infected patient, an S9-3-37-like CDR H3 is either undetectable or >100-fold less prevalent than the 31.b.09-like CDR H3. This analysis suggests that, while 31.b.09-like CDR H3 may be involved at very low frequencies in targeting other non-HA antigens, S9-3-37-like CDR H3 is much more specific for targeting the HA stem region.
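The frequencies quoted for the HA cross-reactive antibodies follow directly from the reported counts. A small sketch of the arithmetic, using only the numbers stated in the text (the helper name is hypothetical):

```python
def percent(count, total):
    """Return count as a percentage of total."""
    return 100.0 * count / total


TOTAL_HA_CROSS_REACTIVE = 1246  # antibodies pooled across all analyzed datasets
D3_9_USAGE = 198                # antibodies that use the D3-9 gene segment
S9_3_37_LIKE = 27               # CDR H3s with the LXYFXWL motif
B09_LIKE = 84                   # CDR H3s with the ILTG motif

if __name__ == "__main__":
    print(f"D3-9 usage:    {percent(D3_9_USAGE, TOTAL_HA_CROSS_REACTIVE):.1f}%")
    print(f"S9-3-37-like:  {percent(S9_3_37_LIKE, TOTAL_HA_CROSS_REACTIVE):.1f}%")
    print(f"31.b.09-like:  {percent(B09_LIKE, TOTAL_HA_CROSS_REACTIVE):.1f}%")
    print(f"fold difference: {B09_LIKE / S9_3_37_LIKE:.1f}")
```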
We also analyzed the germline usage of 53 HIV bnAbs from different clonotypes (Eroshkin et al., 2014) , in which two of them (PGT136 and VRC-PG04b) utilized the D3-9 gene segment ( Figure S7A ). Nonetheless, in these two HIV bnAbs, the D3-9 gene segment only encodes for two to three amino acids (Figures S7B and S7C) . Therefore, D3-9 gene segment in these two HIV bnAbs is unlikely to be as critical for engaging the epitope. Consistent with the analysis of memory B cell repertoires (Figure 6 ), the result here implies that the D3-9 gene segment has a specific role in bnAbs that target influenza HA, but not in other microbial antigens analyzed to date.
From the X-ray structure of group 1 influenza bnAb S9-3-37, we uncovered an important role of the D3-9 gene segment in targeting the HA stem region. In particular, the binding of the D3-9-encoded CDR H3 region to the HA stem is resilient to different reading (translation) frames and has minimum dependency on the VH and JH gene segments. Importantly, the D3-9 gene segment is utilized by HA stem-binding bnAbs from multiple donors (Andrews et al., 2017; Corti et al., 2011; Joyce et al., 2016; Wyrzucki et al., 2014).
Figure 6. [...] (Andrews et al., 2015; Corti et al., 2011; McCarthy et al., 2018; Pappas et al., 2014). Four subjects were analyzed in the meningococcal vaccine dataset (Galson et al., 2015). Three time points that were 1-2 years apart were analyzed in the HIV dataset (Huang et al., 2016). The number of CDR H3 sequences being analyzed (n) in each sample is indicated. Data points are plotted on the x axis if their occurrence frequency equals zero.
An "SOS component" of the antibody repertoire has been used to describe a germline response that is immediately available upon microbial infection with minimum somatic mutations (Lerner, 2011, 2016). VH1-69 is perhaps the most well-known "SOS component" of the antibody repertoire (Dreyfus et al., 2012; Ekiert et al., 2009; Kashyap et al., 2008; Lang et al., 2017; Lerner, 2011, 2016; Sui et al., 2009; Throsby et al., 2008). Our results suggest that the D3-9 gene segment is another "SOS component" of the antibody repertoire. However, it seems that the response from the D3-9 gene segment is not as general as that from VH1-69. VH1-69 is frequently observed in the immune response against pathogens other than influenza virus, such as hepatitis C virus (Marasca et al., 2001), HIV (Gorny et al., 2012; Luftig et al., 2006), and Middle East respiratory syndrome coronavirus (Ying et al., 2015). In contrast, our analysis indicates that the D3-9 gene segment, especially the reading frame that encodes the motif LXYFXWL (S9-3-37-like CDR H3), is more specific to influenza virus (Figures 6 and S7). To date, most large-scale analyses on germline usage have focused on VH gene segments. Future analysis should investigate whether other D gene segments and even JH gene segments contribute to the "SOS component" of the antibody repertoire.
Structural characterization of multidonor class bnAbs has been highly valuable for development of the universal influenza vaccine using a reverse-engineering approach. For example, two different headless HA immunogens (Impagliazzo et al., 2015; Yassine et al., 2015) were designed based on the binding modes of CR9114 (Dreyfus et al., 2012) and CR6261 (Ekiert et al., 2009; Throsby et al., 2008) , which are both V H 1-69-encoded bnAbs. Structural characterization of such bnAbs also provides templates for antiviral development (Wu and Wilson, 2018) . These HA stem-binding bnAbs have also inspired the design of small proteins and peptides that display antiviral activities (Chevalier et al., 2017; Fleishman et al., 2011; Kadam et al., 2017; Whitehead et al., 2012) . In fact, a small neutralizing peptide that binds to the HA stem was designed primarily based on CDR H3 of bnAb FI6v3 containing the D3-9-encoded LXYFXWL motif (Kadam et al., 2017) (PDB 5W6T; Figure S6B ). We anticipate that future discovery and characterization of multidonor class bnAbs will continue to provide insight into antiviral and vaccine design against influenza virus.
As several HA stem-binding bnAbs are currently being tested in clinical trials for both prophylactic and therapeutic usage (Sparrow et al., 2016) , it is equally important to understand the potential for bnAb escape. Although it is generally more difficult for influenza virus to escape from stem antibodies compared to head antibodies (Anderson et al., 2017; Chai et al., 2016; Doud et al., 2018; Ekiert et al., 2009; Sui et al., 2009; Throsby et al., 2008; Yamayoshi et al., 2018) , strong escape mutations have been identified for several HA stem-binding bnAbs (Anderson et al., 2017; Chai et al., 2016; Ekiert et al., 2011; Friesen et al., 2014; Henry Dunand et al., 2015; Yamayoshi et al., 2018) . Most of the escape mutations dramatically decrease the binding affinity to bnAbs. Nonetheless, increasing the membrane fusion pH of the HA can also contribute to HA stem-binding bnAb escape (Chai et al., 2016) . A previous study by some of the authors here has shown that HA1 N21S/Y (residue 11 in H1 numbering), which eliminates the highly conserved N-glycosylation site at HA1 Asn21, abolished the neutralization activity of several HA stem-binding bnAbs, including S9-3-37 (Yamayoshi et al., 2018) . Interestingly, Asn21 is not part of the S9-3-37 epitope ( Figures S6C and S6D) , suggesting that the escape mechanism of N21S/Y is not due to a loss of a direct interaction that could decrease binding to S9-3-37. Rather, such S9-3-37 escape could be due to an increase in the membrane fusion pH, since removal of the N-glycosylation site at Asn21 has been suggested to decrease the pH stability of the virus and, hence, make it more fusion sensitive at higher pH (Yin et al., 2017) . While the HA stem is a promising target for antiviral and vaccine design, future studies of potential escape mechanisms will be critical for the development of a robust anti-influenza strategy.
Detailed methods are provided in the online version of this paper and include the following: and R.U. discovered and prepared the S9-3-37 antibody; N.C.W. performed the binding kinetic assays, X-ray data collection, structure determination, refinement, structural analysis, and sequence analysis; N.C.W. and I.A.W. wrote the paper, and all authors reviewed and edited the paper.
Y.K. has received speaker's honoraria from Toyama Chemical and grant support from Chugai Pharmaceuticals, Daiichi Sankyo Pharmaceutical, Toyama Chemical, Tauns Laboratories, Inc., Tsumura and Co, and Denka Seiken Co., Ltd. Y.K. is a co-founder of FluGen.
Corti, D., Voss, J., Gamblin, S.J., Codoni, G., Macagno, A., Jarrossay, D., Vachieri, S.G., Pinna, D., Minola, A., Vanzetta, F., et al. (2011). A neutralizing antibody selected from plasma cells that binds to group 1 and group 2 influenza A hemagglutinins. Science 333, 850-856.
Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Ian A. Wilson (wilson@scripps.edu).
ExpiCHO cells were maintained according to the manufacturer's instructions (Thermo Fisher Scientific). Sf9 cells and High Five cells were maintained in HyClone insect cell culture medium (GE Healthcare).
The heavy and light chains of S9-3-37 Fab were cloned into phCMV3 vector with an N-terminal secretion signal METDTLLLWVLLLWVPGSTG. The phCMV3-S9-3-37 heavy chain and phCMV3-S9-3-37 light chain were produced in the ExpiCHO expression system (Thermo Fisher Scientific) according to the Max Titer Protocol as described in the manufacturer's instructions. S9-3-37 Fab was purified by the 5 mL HiTrap Protein G HP antibody purification columns and buffer exchanged into 1x PBS.
Influenza hemagglutinin (HA) was prepared for binding studies as previously described (Ekiert et al., 2011) . Briefly, the HA ectodomain, which corresponds to 11-329 (HA1) and 1-176 (HA2) based on H3 numbering, was fused with an N-terminal gp67 signal peptide and a C-terminal BirA biotinylation site, thrombin cleavage site, trimerization domain, and a His 6 tag, and then cloned into a customized baculovirus transfer vector (Ekiert et al., 2011) . Recombinant bacmid DNA was generated using the Bac-to-Bac system (Life Technologies). Baculovirus was generated by transfecting purified bacmid DNA into Sf9 cells using FuGene HD (Promega the C-terminal tag (BirA biotinylation site, thrombin cleavage site, trimerization domain, and the His 6 tag) and to produce the cleaved mature HA (HA1/HA2). The trypsin-digested Viet04 HA was then purified by size exclusion chromatography on a Hiload 16/90 Superdex 200 column (GE Healthcare) in 20 mM Tris pH 8.0, 150 mM NaCl, and 0.02% NaN 3 .
The binding assay was performed by biolayer interferometry (BLI) using an Octet Red instrument (ForteBio, Menlo Park, CA). Biotinylated HA0 at ~10-50 µg mL⁻¹ in 1x kinetics buffer (1x PBS with 0.01% BSA and 0.002% Tween 20) was loaded onto streptavidin biosensors and incubated with supernatant from transfected cells or with the indicated concentration of Fab. Streptavidin biosensors that were not loaded were used as a reference for subtracting background binding from signals. Briefly, the assay consisted of five steps: (1) baseline: 60 s with 1x kinetics buffer; (2) loading: 120 s with biotinylated HA0; (3) baseline: 60 s with 1x kinetics buffer; (4) association: 120 s with samples (supernatant from transfected cells or purified Fab); and (5) dissociation: 120 s with 1x kinetics buffer. For estimating the K_d, a 1:1 binding model was used.
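For readers unfamiliar with the 1:1 binding model, the sketch below writes out the standard Langmuir association and dissociation curves from which K_d is obtained as k_off / k_on. This is an assumption about the general form of the model; the text does not give the fitting equations or the instrument software settings, and all parameter values shown are hypothetical.

```python
import math


def association(t, conc, k_on, k_off, r_max):
    """Response at time t (s) during association at analyte concentration conc (M)."""
    k_obs = k_on * conc + k_off
    r_eq = r_max * conc / (conc + k_off / k_on)  # equilibrium response
    return r_eq * (1.0 - math.exp(-k_obs * t))


def dissociation(t, r0, k_off):
    """Response at time t (s) after the start of dissociation from response r0."""
    return r0 * math.exp(-k_off * t)


def dissociation_constant(k_on, k_off):
    """K_d (M) for a 1:1 interaction."""
    return k_off / k_on


if __name__ == "__main__":
    # Hypothetical rate constants, chosen only to exercise the functions.
    k_on, k_off = 1e5, 1e-3  # M^-1 s^-1 and s^-1
    print("K_d (M):", dissociation_constant(k_on, k_off))
    r_end = association(120.0, 100e-9, k_on, k_off, r_max=1.0)
    print("response after 120 s association:", r_end)
    print("response after 120 s dissociation:", dissociation(120.0, r_end, k_off))
```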
Crystallization and structural determination
S9-3-37 Fab was incubated with the purified Viet04 HA trimer in a molar ratio of 4.5:1 overnight at 4 °C. The S9-3-37 Fab-Viet04 HA complex was purified by size exclusion chromatography on a Hiload 16/90 Superdex 200 column (GE Healthcare) in 20 mM Tris pH 8.0, 150 mM NaCl, and 0.02% NaN3 and concentrated to 9 mg mL⁻¹ in 10 mM Tris pH 8.0, 50 mM NaCl, and 0.02% NaN3. Crystal screening was carried out using our high-throughput, robotic CrystalMation system (Rigaku) at TSRI. The initial crystal screening was based on the sitting drop vapor diffusion method with 35 µL reservoir solution and each drop consisting of 0.1 µL protein + 0.1 µL precipitant. Diffraction-quality crystals were obtained with a reservoir solution containing 20% PEG 6000, 0.1 M HEPES pH 7.0 at 20 °C. Diffraction data were collected at Stanford Synchrotron Radiation Lightsource beamline 12-2. The data were indexed, integrated, and scaled using HKL2000 (HKL Research) (Otwinowski and Minor, 1997). The structure was solved by molecular replacement using Phaser (McCoy et al., 2007), modeled using Coot (Emsley et al., 2010), and refined using Refmac5 (Murshudov et al., 2011). For molecular replacement, PDB 2FK0 (Stevens et al., 2006) was used as the model for Viet04 HA, and a homology model generated by PIGSPro (Lepore et al., 2017) was used for S9-3-37. Ramachandran statistics were calculated using MolProbity (Chen et al., 2010). IgBLAST (Ye et al., 2013) was employed to identify the CDRs on S9-3-37 Fab. The Kabat numbering scheme was used.
Buried surface area calculation
Solvent accessibility was computed by DSSP (Kabsch and Sander, 1983). Buried surface area (BSA) was calculated by subtracting the solvent accessibility of the bound form from that of the apo form. HA residues that had a non-zero BSA were identified as epitope residues.
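The subtraction described above can be sketched as follows. The sketch assumes that per-residue solvent accessibility values (in Å²) have already been extracted from two DSSP runs, one on the unbound chain and one on the complex, and stored in dictionaries keyed by (chain, residue number). The data layout, function names, and numerical values are hypothetical.

```python
def buried_surface_area(apo_sasa, bound_sasa):
    """Per-residue BSA: accessibility in the apo form minus that in the bound form."""
    return {res: apo_sasa[res] - bound_sasa[res] for res in apo_sasa}


def epitope_residues(bsa, tol=1e-6):
    """Residues with a non-zero BSA (tolerance guards against rounding noise)."""
    return sorted(res for res, area in bsa.items() if area > tol)


if __name__ == "__main__":
    # Hypothetical accessibility values for three HA residues.
    apo = {("HA2", 21): 80.0, ("HA2", 45): 30.0, ("HA1", 100): 55.0}
    bound = {("HA2", 21): 10.0, ("HA2", 45): 5.0, ("HA1", 100): 55.0}
    bsa = buried_surface_area(apo, bound)
    print("epitope residues:", epitope_residues(bsa))
    print("total BSA:", sum(bsa.values()))
```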
Analysis of natural HA variants
A total of 103,301 full-length HA protein sequences from different subtypes were downloaded from the Global Initiative for Sharing Avian Influenza Data (GISAID; https://gisaid.org). To avoid temporal sampling bias, we sampled at most 20 sequences per year per subtype, which resulted in a total of 6,984 HA sequences. Multiple sequence alignment of those 6,984 HA sequences was performed by MAFFT version 7.157b (Katoh and Standley, 2013). Sequence logos were generated by WebLogo (Crooks et al., 2004).
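The downsampling step (at most 20 sequences per year per subtype) can be sketched as below. The text does not state how the sequences within each group were chosen, so uniform random sampling with a fixed seed is an assumption made here, as is the (subtype, year, sequence) record layout.

```python
import random
from collections import defaultdict


def downsample(records, max_per_group=20, seed=0):
    """Keep at most max_per_group records for each (subtype, year) pair.

    records: iterable of (subtype, year, sequence) tuples.
    """
    groups = defaultdict(list)
    for subtype, year, seq in records:
        groups[(subtype, year)].append((subtype, year, seq))

    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    sampled = []
    for key in sorted(groups):
        members = groups[key]
        if len(members) > max_per_group:
            members = rng.sample(members, max_per_group)
        sampled.extend(members)
    return sampled


if __name__ == "__main__":
    # Hypothetical records: 50 sequences in one group, 5 in another.
    toy = [("H1", 2009, f"seq{i}") for i in range(50)]
    toy += [("H3", 2010, f"seq{i}") for i in range(5)]
    print(len(downsample(toy)))
```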
Sequence entropy was calculated by:

$$\text{Sequence entropy} = -\sum_{i=1}^{M} P_i \log P_i$$

where $P_i$ is the fraction of residues of amino acid type $i$, and $M$ is the number of amino acid types (i.e., 20). In Figure 2B, sequence entropy was calculated based on the alignment of 20 representative strains from different types and subtypes shown in Figure S4.
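As a rough illustration of a per-position entropy consistent with the definitions of P_i and M above, the sketch below assumes the standard Shannon form. The logarithm base (2 here) and the choice to ignore gaps and non-standard characters are assumptions, and the function names are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # M = 20 amino acid types


def column_entropy(column, base=2.0):
    """Shannon entropy of one alignment column.

    Characters outside the 20 standard amino acids (e.g., gaps) are ignored,
    which is an assumption rather than a documented choice.
    """
    counts = Counter(aa for aa in column if aa in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for n in counts.values():
        p = n / total  # P_i: fraction of residues of amino acid type i
        entropy -= p * math.log(p, base)
    return entropy


def alignment_entropy(sequences, base=2.0):
    """Entropy at every position of equal-length aligned sequences."""
    return [column_entropy(column, base) for column in zip(*sequences)]
```

A fully conserved position gives an entropy of 0, and the value grows as more amino acid types share the position more evenly.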
Analysis of published antibody sequences
CDR H3 sequences recovered from H3 and H5 cross-reactive memory B cells of six H5N1 DNA/MIV-prime-boost influenza vaccinated subjects were retrieved from Table S3 in Joyce et al. (Joyce et al., 2016). Germline usage of H1/H3/H7 cross-reactive B cells from six H7N9 DNA/MIV-prime-boost influenza vaccine subjects was retrieved from Table S7 in Andrews et al. (Andrews et al., 2017). Germline usage of H1 and H3 cross-reactive memory B cells from six subjects immunized with trivalent influenza vaccine was retrieved from Figure S4 in McCarthy et al. (McCarthy et al., 2018). Germline usage of HA stem-specific memory B cells from subjects immunized with trivalent influenza vaccine was retrieved from Tables S3 and S4 in Andrews et al. (Andrews et al., 2015). Germline usage of HA stem-specific memory B cells from a single subject immunized with seasonal influenza vaccine was retrieved from Figure S1 in Pappas et al. (Pappas et al., 2014). CDR H3 sequences from memory B cells of three healthy donors, together with their occurrence frequencies, were retrieved from Supplemental Dataset 1 in DeKosky et al. (DeKosky et al., 2015). CDR H3 sequences from the IgG+ memory B cell repertoires of four donors after administration of two meningococcal vaccines (Galson et al., 2015), and from the IgG+ memory B cell repertoire of an HIV-infected patient at three time points that were one to two years apart (Huang et al., 2016), were downloaded from the Observed Antibody Space database (Kovaltsuk et al., 2018). Nucleotide sequences of HIV bnAbs were retrieved from the bNAber database (Eroshkin et al., 2014). Germline usages were identified using IgBLAST (Ye et al., 2013). S9-3-37-like CDR H3s were defined as those CDR H3s that utilized the D3-9 gene segment and encoded a LXYFXWL motif, where X represents any amino acid.
31.b.09-like CDR H3s were defined as those CDR H3s that utilized the D3-9 gene segment and encoded an ILTG motif, with one mismatch allowed at any residue other than the Leu at the second position of the motif (i.e., no variation was allowed for that Leu).
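The two motif definitions above can be expressed as simple string tests. The sketch below covers only the motif part of each definition; D3-9 gene usage would still need to come from a germline assignment tool such as IgBLAST. Reading "one mismatch" as "at most one mismatch" is an assumption, and the function names are hypothetical.

```python
import re

# LXYFXWL, where X is any amino acid (any uppercase letter here).
S9_3_37_MOTIF = re.compile(r"L[A-Z]YF[A-Z]WL")


def has_s9_3_37_motif(cdr_h3):
    """True if the CDR H3 amino acid sequence contains an LXYFXWL motif."""
    return S9_3_37_MOTIF.search(cdr_h3) is not None


def has_31b09_motif(cdr_h3, motif="ILTG", fixed_index=1, max_mismatch=1):
    """True if the CDR H3 contains ILTG with at most `max_mismatch` mismatches.

    The residue at `fixed_index` (the Leu at the second position) must match
    exactly. Treating "one mismatch" as "at most one" is an assumption.
    """
    width = len(motif)
    for start in range(len(cdr_h3) - width + 1):
        window = cdr_h3[start:start + width]
        if window[fixed_index] != motif[fixed_index]:
            continue  # no variation allowed at the fixed Leu
        mismatches = sum(a != b for a, b in zip(window, motif))
        if mismatches <= max_mismatch:
            return True
    return False
```

For instance, a sequence containing LGYFDWL (the D3-9-encoded region of S9-3-37) passes the first test, while a sequence in which the Leu of ILTG is replaced fails the second test regardless of the other residues.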
Statistical analysis was not performed in this study.
The X-ray coordinates and structure factors of S9-3-37 in complex with H5 HA have been deposited in the RCSB Protein Data Bank under accession code 6E3H. Custom Python scripts for sequence analyses have been deposited to https://github.com/wchnicholas/S9-3-37.

(Andrews et al., 2017), and potentially binds to the same epitope as S9-3-37. The D3-9-encoded regions of S9-3-37 and 27-1C08 are aligned. Despite the difference in CDR H3 conformation, the D3-9-encoded region in 27-1C08 is highly similar to that in S9-3-37. Regions corresponding to CDR H3 are highlighted. A zoom-in view displays the tip of the CDR H3, with side chains of interest shown in stick representation. The apo structure of 27-1C08 is used here (PDB 5WCA) (Andrews et al., 2017), since the structure of 27-1C08 in complex with HA was not determined. At the top, the germline sequence of D3-9 is aligned with the CDR H3 D gene segment of S9-3-37 and 27-1C08. Somatic mutations are underlined. Nucleotides from N-regions are shown in lower case. (B) The D3-9-encoded region in S9-3-37 is in blue. The designed small peptide P7 (PDB 5W6T) (Kadam et al., 2017) is in orange. All side chains in P7 are shown in stick representation. Side chains of the contacting residues in the D3-9-encoded region of S9-3-37 (LGYFDWL) are also shown in stick representation and are labeled. (C) HA1 Asn21 is an N-glycosylation site. HA1 Asn21 is colored in orange on the structure of HA in complex with S9-3-37. HA1 is in dark gray, HA2 is in light gray, and S9-3-37 is in blue. (D) The location of HA1 Asn21 relative to the S9-3-37 epitope is shown. The S9-3-37 epitope is in yellow.