PMC:5118421 / 12033-28515 JSONTXT

Annotations

    TEST0

    {"project":"TEST0","denotations":[{"id":"27920779-108-113-3100221","span":{"begin":224,"end":225},"obj":"[\"8011289\"]"},{"id":"27920779-77-82-3100222","span":{"begin":305,"end":306},"obj":"[\"20805563\"]"},{"id":"27920779-80-86-3100223","span":{"begin":308,"end":310},"obj":"[\"23181829\", \"18677426\", \"18621685\", \"15968001\"]"},{"id":"27920779-193-198-3100224","span":{"begin":421,"end":422},"obj":"[\"20805563\"]"},{"id":"27920779-159-164-3100225","span":{"begin":584,"end":585},"obj":"[\"20805563\", \"25634361\", \"8011289\"]"},{"id":"27920779-164-170-3100226","span":{"begin":589,"end":591},"obj":"[\"7651532\"]"},{"id":"27920779-75-81-3100227","span":{"begin":3311,"end":3313},"obj":"[\"7651532\"]"},{"id":"27920779-144-150-3100228","span":{"begin":4384,"end":4386},"obj":"[\"16497590\"]"},{"id":"27920779-148-154-3100229","span":{"begin":4388,"end":4390},"obj":"[\"16623761\"]"},{"id":"27920779-90-96-3100230","span":{"begin":4791,"end":4793},"obj":"[\"8738915\"]"},{"id":"27920779-102-108-3100231","span":{"begin":4963,"end":4965},"obj":"[\"11859119\"]"},{"id":"27920779-106-112-3100232","span":{"begin":4967,"end":4969},"obj":"[\"12943801\"]"},{"id":"27920779-110-116-3100233","span":{"begin":4971,"end":4973},"obj":"[\"8786330\"]"},{"id":"27920779-114-120-3100234","span":{"begin":4975,"end":4977},"obj":"[\"1420357\"]"},{"id":"27920779-118-124-3100235","span":{"begin":4979,"end":4981},"obj":"[\"8943046\"]"},{"id":"27920779-124-130-3100236","span":{"begin":5296,"end":5298},"obj":"[\"7651532\"]"},{"id":"27920779-128-134-3100237","span":{"begin":5300,"end":5302},"obj":"[\"8738915\"]"},{"id":"27920779-107-113-3100238","span":{"begin":5917,"end":5919},"obj":"[\"8786330\"]"},{"id":"27920779-111-117-3100239","span":{"begin":5921,"end":5923},"obj":"[\"25646473\"]"},{"id":"27920779-118-124-3100240","span":{"begin":7265,"end":7267},"obj":"[\"15860590\", \"20882016\", 
\"23940276\"]"},{"id":"27920779-165-171-3100241","span":{"begin":7438,"end":7440},"obj":"[\"15529369\"]"},{"id":"27920779-57-63-3100242","span":{"begin":8212,"end":8214},"obj":"[\"21220454\"]"},{"id":"27920779-61-67-3100243","span":{"begin":8216,"end":8218},"obj":"[\"22615367\"]"},{"id":"27920779-42-48-3100244","span":{"begin":10239,"end":10241},"obj":"[\"21220454\", \"21220454\"]"},{"id":"27920779-36-42-3100245","span":{"begin":10280,"end":10282},"obj":"[\"22615367\", \"22615367\"]"},{"id":"27920779-183-189-3100246","span":{"begin":10839,"end":10841},"obj":"[\"11859119\"]"},{"id":"27920779-144-150-3100247","span":{"begin":11263,"end":11265},"obj":"[\"22407974\"]"},{"id":"27920779-42-48-3100244","span":{"begin":10239,"end":10241},"obj":"[\"21220454\", \"21220454\"]"},{"id":"27920779-36-42-3100245","span":{"begin":10280,"end":10282},"obj":"[\"22615367\", \"22615367\"]"},{"id":"27920779-56-62-3100248","span":{"begin":12524,"end":12526},"obj":"[\"22407974\"]"},{"id":"27920779-125-131-3100249","span":{"begin":13807,"end":13809},"obj":"[\"22407974\"]"},{"id":"27920779-62-68-3100250","span":{"begin":15207,"end":15209},"obj":"[\"11859119\"]"},{"id":"27920779-66-72-3100251","span":{"begin":15211,"end":15213},"obj":"[\"12943801\"]"},{"id":"27920779-70-76-3100252","span":{"begin":15215,"end":15217},"obj":"[\"8786330\"]"},{"id":"27920779-237-243-3100253","span":{"begin":15749,"end":15751},"obj":"[\"7916950\"]"},{"id":"27920779-86-92-3100254","span":{"begin":16371,"end":16373},"obj":"[\"22407974\"]"}],"text":"Results\n\nAGY Ser Codons, but Not TCN Ser Codons, Are Enriched in Germline-Encoded CDR Sequences of IgV-Region Genes\nIt is well established that CDR Arg residues play a major role in specifying the nuclear reactivity of ANA (3). Moreover, in spontaneous SLE, many ANA arise by SHM of non-autoreactive Abs (1, 28–31), and this is often associated with the conversion of CDR germline-encoded AGY Ser codons into Arg codons (1). 
At the same time, germline IgVH, Vκ, and Vλ genes have unusually high frequencies of AGY Ser codons in CDRs, and this tendency holds for both mice and humans (1–3, 17).\nIf AGY Ser codon abundance in Ab CDRs were merely due to a selection pressure to preserve Ser residues among germline-encoded V-region genes, we would expect equally high frequencies of four other serine codons (TCN). However, CDR TCN codon abundance, as defined by observed/expected ratios, was inconsistent across mouse and human VH, Vκ, and Vλ genes, reaching only 2.3-fold more than expected in the most extreme case (mouse Vκ) and less than expected in mouse and human VH genes and mouse Vλ genes (Figure 1A). Moreover, in most cases, TCN abundance was higher in FRs than in CDRs. In contrast, AGY codons were far more abundant in CDRs than expected and consistently much more so than in FRs (Figure 1A). To avoid a bias in our analyses, we took expected frequencies from codon usage tables for mouse and human genes rather than the random expected frequency of 0.016 (1/61) for a given codon. This is because the TCG codon includes the rare CpG dinucleotide, so using 0.016 would inflate the expected cumulative frequency of TCN codons, thereby reducing observed/expected ratios for TCN.\nFigure 1 High frequencies of AGY, but not TCN Ser codons among germline-encoded CDR sequences of IgV-region genes. (A) Ratio observed/expected for AGY and TCN Ser codons in human and mouse IgV-region genes. Germline CDRs and FRs were defined using the Kabat numbering system. Expected ratio was defined by frequencies of 52,926 mouse codons and 40,662,582 human codons at http://www.kazusa.or.jp/codon/. (B). Total numbers of AGY or TCN Ser codons per germline-encoded CDR sequences. 
Box plots were generated as indicated in Section “Materials and Methods.” Briefly, the center line indicates the median; box limits indicate the 25th and 75th percentiles; whiskers extend to minimum and maximum values, and crosses represent sample means. Notches represent the 95% confidence interval for each median. (C) Donut graphs represent the number of CDR1\u00262 AGY Ser codons minus the number of TCN Ser codons for a given gene. The gray, white, and black areas denote the number of IgV genes in which AGY Ser codon numbers are greater than, equal to, or less than TCN codon numbers respectively. Number of sequences indicated in center. p values were determined using a two-tailed paired t-test. ***p \u003c 0.0001. In addition to comparing observed/expected ratios for AGY and TCN codons, we also compared absolute numbers of these codons in mouse and human germline VH, Vκ, and Vλ genes. Despite a greater number of possible TCN codons, the bias favoring AGY Ser codons was still evident in all three major families of V genes for both species (Figures 1B,C). These abundance data are in agreement with data reported by Wagner et al. (17), showing that CDR AGY codons outnumber TCN codons at most CDR positions. Finally, the serine codon bias was not restricted to the idiosyncrasies of the Kabat CDR/FR definitions used in our analyses because it also applied to CDRs defined by the IMGT system (Figure S1 in Supplementary Material). Collectively, these results show that high frequencies of germline AGY serine codons in CDRs cannot be explained solely by a selection pressure favoring germline-encoded CDR serine residues.\n\nCDR AGY Codon Bias in Ig Genes Is the Product of an Evolutionary Selection Pressure\nThe frequent use of CDR AGY Ser codons among IgV-region genes from two different species (human and mouse) led us to speculate that this feature might be highly conserved in evolution. 
Thus, we analyzed IgVH gene sequences of cartilaginous fishes (class Chondrichthyes), which are descendants of the most ancient species with an adaptive immune system. The immune systems of species in this class share major features with those of mammals, including SHM, although not class switch recombination (32, 33). Our analysis of germline VH sequences from four Chondrichthyes species indicated that, as in mice and humans, AGY but not TCN Ser codons were enriched in germline-encoded CDR sequences (Figures S2A,B in Supplementary Material). Thus, the CDR AGY codon bias is a highly conserved feature of IgV-region genes. A similar trend was also observed in several other less distant species, by Jolly et al. (18).\n\nPreferential Use of AGY Triplets in the Ser Codon Reading Frame\nBecause the AGC triplet has been shown to be an intrinsically preferred target for AID-dependent SHM (13, 15, 16, 34, 35), it is plausible that high frequencies of CDR AGY codons resulted solely from an evolutionary pressure to ensure high somatic mutation frequencies in CDR sequences during immune responses. This would be consistent with the fact that αβTCR genes do not share the CDR AGY abundance and bias features with Ig genes (17, 18) (Figures S2C,D in Supplementary Material). If CDR AGY codons were preserved solely to enhance mutability, we would predict that AGY triplets would be equally frequent in all three reading frames. However, this was not the case. Even when only one AGY base was required to be contained within a CDR for inclusion in the non-coding CDR frame counts, AGY triplets in the Ser reading frame were nearly always more frequent than the combined frequencies of those in the two other reading frames (Figures 2A–C). This trend also held for AGC triplets contained within the context of the extremely mutable AGCT sequence (16, 36) (Figures S3A,B in Supplementary Material). 
Finally, the intrinsically mutable AGC triplet was consistently more frequent in the Ser reading frame than was the combined frequency for GCT triplets in all three reading frames (AGC on opposite strand), the only exception being the small mouse Vλ gene family (Figure S3C in Supplementary Material). These results argue that the abundance of germline CDR AGY codons was not solely due to an evolutionary selection pressure for high CDR mutability via SHM.\nFigure 2 Preferential use of the AGY triplets among CDR sequences in the Ser reading frame. (A) Schematic of how AGY triplets in the different reading frames were determined at CDR boundaries. AGY triplets at CDR boundaries were counted in non-coding frames if one or two bases were located in the CDR. (B) Numbers of in-frame Ser AGC codons compared to combined numbers of AGC triplets in two non-coding frames. (C) Same analysis as in (B) applied to AGT. Box plots and whiskers were defined in Figure 1 and in Section “Materials and Methods.”\n\nArginine Residues in Antiviral Ab Are Often Created by SHM of AGY Ser Codons\nAn abundance of CDR codons that are prone to mutate to encode antinuclear Ab seemed paradoxical. However, there is speculation that a modest degree of autoreactivity may be beneficial to antiviral immune responses (37–39). For example, some viruses display host-derived nuclear material on their capsids that might enhance B cell activation or antibody efficacy due to an avidity effect (40). Therefore, we sought to determine if Arg residues are frequently generated via SHM in antiviral Ab. At first, we examined somatic mutations in broadly neutralizing antibodies (bNAbs) against HIV. Although we found that somatic mutations in AGY codons frequently produced Arg codons in these Abs, the results were not easily interpreted because overall mutation frequencies were extremely high, and in many cases CDR boundaries could not be defined due to insertions and deletions. 
Therefore, we extended our analysis to 298 published sequences of human antibodies against eight other virus species or subspecies. This analysis revealed frequent somatic mutations converting AGY Ser codons in CDRs to Arg codons.\nIn two human studies involving the H1N1 influenza virus (23, 24), 17 out of 46 and 24 out of 49 antibodies had at least one AGY Ser to Arg amino acid replacement resulting from SHM (Figure 3A). Arg replacement mutations in CDR sequences accounted for 2.9 and 3.1% of all V-region gene missense mutations (CDRs and FRs) in the two studies, with replacements at germline AGY codons comprising most of these (2 and 2.23%). A similar trend was observed in antibodies against hepatitis A, B, and C, rhino, dengue, avian influenza, and West Nile viruses. CDR Arg mutations accounted for 2.4–9.4% of all missense mutations in V-region genes for these antibodies, most of which (1.5–6.6%) occurred at germline CDR AGY codons (Figure 3B; Table 1).\nFigure 3 Somatically generated Arg codons often arise at germline CDR AGY Ser codons in antiviral immune responses. (A) Sequences and analyses from two studies of anti-H1N1 antibodies, as described in Section “Materials and Methods.” Heavy and light chains for a particular clone were combined to generate data for the graphs. The data combine the results of CDR and FR analyses. Any → Arg indicates a mutation at any non-Arg codon that gives rise to an Arg codon. Ser → Arg indicates an AGY Ser codon to Arg codon mutation. Numbers inside graphs indicate number of clones that were analyzed (heavy plus light chain). (B) Bars represent the average number of indicated replacement mutations among antiviral antibodies (heavy or light chain genes). 
Influenza #1 (n = 92), Influenza #2 (n = 98), Rhinovirus (n = 12), Avian Influenza (n = 27), West Nile (n = 6), Dengue virus (n = 4), Hepatitis A, B and C (n = 59).\nTable 1 Amino acid replacements via SHM of CDR AGY Ser codons.\nImmunogen Asn (%) Gly (%) Thr (%) Arg (%) Others (%) #CDR AGY SHM\nInfluenzaa 22c 16c 19c 11 32 142\nInfluenzab 30c 12 23c 16 19 107\nWest Nile 20 0 60c 20 0 5\nDengue 14 0 43c 29 14 7\nRhinovirus 7 0 15 26 52 27\nAvian Influenza 50c 0 17 33 0 12\nHep. A, B, and C 22c 18 18 20 22 72\naAntibody sequences from Wrammert et al. (23).\nbAntibody sequences from Li et al. (24).\ncAmino acid replacements that occurred more often than Arg replacements at CDR AGY Ser codons.\n\nCDR AGY Codons Frequently Mutate to Produce Codons for Key Ag-Contact Residues in the Ab-Binding Site\nOur analyses of somatic mutations in antiviral Ab led to an unexpected finding: CDR AGY Ser codons frequently mutated to Asn, Thr, and Gly codons in addition to Arg codons. Most of these mutations occurred by single-base changes, predominantly at the central base in the AGY triplet (Table 2), which is the position that is preferentially targeted by AID (13). In many cases, mutations to these alternative codons, particularly those for Asn and Thr, were more frequent than to Arg codons. For example, in anti-influenza Abs, CDR AGY mutations to Asn and Thr codons were each approximately twice as frequent as mutations to Arg codons. These observations were particularly revealing because in their analyses of numerous crystal structures of Ab–Ag complexes, Raghunathan et al. (19) identified Asn, Thr, Arg, Gly, Ser, Asp, and Tyr as key (i.e., most frequent) Ag-contact residues.\nTable 2 Base distribution of somatic mutations in CDR AGY Ser codons.\nImmunogen AGY (%) AGY (%) AGY (%) 2 changes (%) 3 changes (%)\nInfluenzaa 12 53 11 20 4\nInfluenzab 11 52 15 20 2\nWest Nile 0 80 20 0 0\nDengue 15 57 14 14 0\nRhinovirus 0 22 19 52 7\nAvian Influenza 0 67 33 0 0\nHep. 
A, B, and C 12 35 18 35 0\naAntibody sequences from Wrammert et al. (23).\nbAntibody sequences from Li et al. (24). In the report by Raghunathan and colleagues, it was not clear which contact residues were generated by SHM. To determine if residues frequently generated by SHM of AGY Ser codons are associated with Ab affinity maturation, we analyzed 72 (46 mouse and 26 human) Ab–Ag crystal structures available in the RCSB protein data bank (pdb) database, identified predicted Ag-contact residues, and searched IgBLAST to distinguish those that were germline-encoded from those that were somatically generated. When mouse and human data where combined, the seven most frequent Ag-contact residues were Arg, Asp, Asn, Gly, Ser, Thr and Tyr (Figure S4 in Supplementary Material). This result is identical to that of Raghunathan et al. (19), even though only 4 of the 72 structures we analyzed were also analyzed by them. Yet, we found that only three (Asn, Ser, and Tyr) of those seven residues (Arg, Asn, Asp, Gly, Ser, Thr, and Tyr) were present at higher frequencies than expected within CDRs of mouse and human germline IgV-region genes (Figure 4A). Importantly, amino acids resulting from SHM accounted for only 10–23% (average 14.7%) of all Ag-contact residues (Table 3 footnotes; Figure S4 in Supplementary Material). This is relevant to our conclusion regarding AGY versatility because it means that the seven key Ag-contact residues were largely defined by germline-encoded contacts; yet four (Asn, Arg, Gly, and Thr) of the seven most abundant contact residues arise frequently from somatic mutations at CDR AGY codons.\nFigure 4 CDR AGY Ser codons play a key role in affinity maturation. (A) Ratio observed over expected for synonymous codons in CDR sequences of combined IgV genes (VH, Vκ, and Vλ). (B) Percentage of the total contact residues that were created by SHM in V-region sequences only. Each data set represents a germline-encoded codon given rise to any contact residue. 
Black bars represent the percentage of AGY Ser codons that gave rise to a key contact residue defined by Raghunathan et al. (19).\nTable 3 Amino acid replacements due to somatic mutation of germline AGY Ser codons.a\nContact mutations at AGYb % of all contact mutationsc\nHuman Mouse Human (%) Mouse (%)\nArg 4 7 5.55 6.73\nAsn 5 9 6.94 8.65\nGly 0 1 0 0.96\nThr 6 5 8.33 4.81\nOthers 15 8 20.83 7.69\naData from 26 human and 46 mouse crystal structures of Ag–Ab complexes.\nbV-region contact residues arising from SHM of AGY Ser codons. Numbers expressed in absolute numbers. Total contact residues analyzed were 317 (human) and 886 (mouse). Total contact residues that were associated with SHM of a V-region codon were 72 (human) and 104 (mouse).\ncPercentage of total somatically generated contacts residues that arose from mutation of AGY Ser codons. For somatically generated contact residues, mutations at AGY Ser codons were the most abundant by far, and occurred ~2–3 times more often than mutations at AAY Asn codons (Figure 4B), the second most consistently mutated codon group. Most importantly, AGY Ser codons mutated to contact residues more often than any other codon group (Figure 4B), and a large proportion of these (~70%) were those defined as key Ag-contact residues. AGY mutations to codons for Arg, Asn, and Thr were the most consistent, and this was true for both contact and non-contact residues (Table 3 and data not shown). AAY triplets are also intrinsically preferred targets of SHM (13, 15, 16). However, when considering the potential to mutate to 1 of the 6 non-synonymous key contact residues (Arg, Asn, Asp, Gly, Ser, Thr, and Tyr), AGY Ser codons are able to do so via 12 out of 18 possible single-base changes. 
For AAY (Asn), this occurs with 8 out of 18 base changes, and for TCN, it occurs with only 6 out of 36 base substitutions (Figure 5), a result that is in agreement with the observation by Chang and Casali that CDR, but not FR sequences, are prone to acquire replacement mutations upon random point mutation (41). Collectively, the results of these analyses indicate that AGY codons contribute to Ab affinity both directly, by encoding a Ser residue, and indirectly due to the ease with which they mutate to encode other residues beneficial to the process of Ab affinity maturation. We believe this is the most straightforward explanation for the conservation of AGY codon abundance in CDRs of germline IgV-region genes.\nFigure 5 AGY Ser codons plasticity. Probability of creating a key non-synonymous contact residue by one nucleotide change. Filled gray boxes indicate a key Ag-contact residue as defined by Raghunathan et al. (19). White boxes indicate a synonymous change, a non-key contact residue (defined in the text) or a stop codon."}
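
The closing paragraph of the Results text above quotes how many single-base changes turn a codon group into a non-synonymous key Ag-contact residue: 12 of 18 for AGY, 8 of 18 for AAY, and 6 of 36 for TCN. The following is a minimal sketch for checking that arithmetic. It assumes the standard genetic code and takes the key contact residues (Arg, Asn, Asp, Gly, Ser, Thr, Tyr) from the text. The names `CODON_TABLE`, `KEY_CONTACTS`, and `key_contact_changes` are hypothetical, introduced here for illustration only, and this is not the authors' own procedure.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G order.
# (Assumption: the paper's counts use the standard code.)
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO)
}

# Key Ag-contact residues named in the text: Arg, Asn, Asp, Gly, Ser, Thr, Tyr.
KEY_CONTACTS = set("RNDGSTY")


def key_contact_changes(codons):
    """Count single-base changes that yield a non-synonymous key contact residue.

    Returns (hits, total), where total is the number of possible
    single-base substitutions across all codons given (9 per codon).
    """
    hits = 0
    total = 0
    for codon in codons:
        original = CODON_TABLE[codon]
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue  # not a substitution
                mutant = codon[:pos] + base + codon[pos + 1:]
                total += 1
                new = CODON_TABLE[mutant]
                # Synonymous changes and stop codons ("*") do not count.
                if new != original and new in KEY_CONTACTS:
                    hits += 1
    return hits, total


if __name__ == "__main__":
    groups = {
        "AGY (Ser)": ["AGC", "AGT"],
        "AAY (Asn)": ["AAC", "AAT"],
        "TCN (Ser)": ["TCA", "TCC", "TCG", "TCT"],
    }
    for name, codons in groups.items():
        hits, total = key_contact_changes(codons)
        print(f"{name}: {hits} of {total}")
```

Under these assumptions the sketch gives the same three fractions as the text. Each substitution is weighted equally, so the AID targeting preferences discussed in the Results are not modeled.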

    2_test

    {"project":"2_test","denotations":[{"id":"27920779-8011289-34707944","span":{"begin":224,"end":225},"obj":"8011289"},{"id":"27920779-20805563-34707945","span":{"begin":305,"end":306},"obj":"20805563"},{"id":"27920779-23181829-34707946","span":{"begin":308,"end":310},"obj":"23181829"},{"id":"27920779-18677426-34707946","span":{"begin":308,"end":310},"obj":"18677426"},{"id":"27920779-18621685-34707946","span":{"begin":308,"end":310},"obj":"18621685"},{"id":"27920779-15968001-34707946","span":{"begin":308,"end":310},"obj":"15968001"},{"id":"27920779-20805563-34707947","span":{"begin":421,"end":422},"obj":"20805563"},{"id":"27920779-20805563-34707948","span":{"begin":584,"end":585},"obj":"20805563"},{"id":"27920779-25634361-34707948","span":{"begin":584,"end":585},"obj":"25634361"},{"id":"27920779-8011289-34707948","span":{"begin":584,"end":585},"obj":"8011289"},{"id":"27920779-7651532-34707949","span":{"begin":589,"end":591},"obj":"7651532"},{"id":"27920779-7651532-34707950","span":{"begin":3311,"end":3313},"obj":"7651532"},{"id":"27920779-16497590-34707951","span":{"begin":4384,"end":4386},"obj":"16497590"},{"id":"27920779-16623761-34707952","span":{"begin":4388,"end":4390},"obj":"16623761"},{"id":"27920779-8738915-34707953","span":{"begin":4791,"end":4793},"obj":"8738915"},{"id":"27920779-11859119-34707954","span":{"begin":4963,"end":4965},"obj":"11859119"},{"id":"27920779-12943801-34707955","span":{"begin":4967,"end":4969},"obj":"12943801"},{"id":"27920779-8786330-34707956","span":{"begin":4971,"end":4973},"obj":"8786330"},{"id":"27920779-1420357-34707957","span":{"begin":4975,"end":4977},"obj":"1420357"},{"id":"27920779-8943046-34707958","span":{"begin":4979,"end":4981},"obj":"8943046"},{"id":"27920779-7651532-34707959","span":{"begin":5296,"end":5298},"obj":"7651532"},{"id":"27920779-8738915-34707960","span":{"begin":5300,"end":5302},"obj":"8738915"},{"id":"27920779-8786330-34707961","span":{"begin":5917,"end":5919},"obj":"8786330"},{"id":"27920779-25646473-34
707962","span":{"begin":5921,"end":5923},"obj":"25646473"},{"id":"27920779-15860590-34707963","span":{"begin":7265,"end":7267},"obj":"15860590"},{"id":"27920779-20882016-34707963","span":{"begin":7265,"end":7267},"obj":"20882016"},{"id":"27920779-23940276-34707963","span":{"begin":7265,"end":7267},"obj":"23940276"},{"id":"27920779-15529369-34707964","span":{"begin":7438,"end":7440},"obj":"15529369"},{"id":"27920779-21220454-34707965","span":{"begin":8212,"end":8214},"obj":"21220454"},{"id":"27920779-22615367-34707966","span":{"begin":8216,"end":8218},"obj":"22615367"},{"id":"27920779-21220454-34707967","span":{"begin":10239,"end":10241},"obj":"21220454"},{"id":"27920779-22615367-34707968","span":{"begin":10280,"end":10282},"obj":"22615367"},{"id":"27920779-11859119-34707969","span":{"begin":10839,"end":10841},"obj":"11859119"},{"id":"27920779-22407974-34707970","span":{"begin":11263,"end":11265},"obj":"22407974"},{"id":"27920779-21220454-34707967","span":{"begin":10239,"end":10241},"obj":"21220454"},{"id":"27920779-22615367-34707968","span":{"begin":10280,"end":10282},"obj":"22615367"},{"id":"27920779-22407974-34707971","span":{"begin":12524,"end":12526},"obj":"22407974"},{"id":"27920779-22407974-34707972","span":{"begin":13807,"end":13809},"obj":"22407974"},{"id":"27920779-11859119-34707973","span":{"begin":15207,"end":15209},"obj":"11859119"},{"id":"27920779-12943801-34707974","span":{"begin":15211,"end":15213},"obj":"12943801"},{"id":"27920779-8786330-34707975","span":{"begin":15215,"end":15217},"obj":"8786330"},{"id":"27920779-7916950-34707976","span":{"begin":15749,"end":15751},"obj":"7916950"},{"id":"27920779-22407974-34707977","span":{"begin":16371,"end":16373},"obj":"22407974"}],"text":"Results\n\nAGY Ser Codons, but Not TCN Ser Codons, Are Enriched in Germline-Encoded CDR Sequences of IgV-Region Genes\nIt is well established that CDR Arg residues play a major role in specifying the nuclear reactivity of ANA (3). 
Moreover, in spontaneous SLE, many ANA arise by SHM of non-autoreactive Abs (1, 28–31), and this is often associated with the conversion of CDR germline-encoded AGY Ser codons into Arg codons (1). At the same time, germline IgVH, Vκ, and Vλ genes have unusually high frequencies of AGY Ser codons in CDRs, and this tendency holds for both mice and humans (1–3, 17).\nIf AGY Ser codon abundance in Ab CDRs were merely due to a selection pressure to preserve Ser residues among germline-encoded V-region genes, we would expect equally high frequencies of four other serine codons (TCN). However, CDR TCN codon abundance, as defined by observed/expected ratios, was inconsistent across mouse and human VH, Vκ, and Vλ genes, reaching only 2.3-fold more than expected in the most extreme case (mouse Vκ) and less than expected in mouse and human VH genes and mouse Vλ genes (Figure 1A). Moreover, in most cases, TCN abundance was higher in FRs than in CDRs. In contrast, AGY codons were far more abundant in CDRs than expected and consistently much more so than in FRs (Figure 1A). To avoid a bias in our analyses, we took expected frequencies from codon usage tables for mouse and human genes rather than the random expected frequency of 0.016 (1/61) for a given codon. This is because the TCG codon includes the rare CpG dinucleotide, so using 0.016 would inflate the expected cumulative frequency of TCN codons, thereby reducing observed/expected ratios for TCN.\nFigure 1 High frequencies of AGY, but not TCN Ser codons among germline-encoded CDR sequences of IgV-region genes. (A) Ratio observed/expected for AGY and TCN Ser codons in human and mouse IgV-region genes. Germline CDRs and FRs were defined using the Kabat numbering system. Expected ratio was defined by frequencies of 52,926 mouse codons and 40,662,582 human codons at http://www.kazusa.or.jp/codon/. (B). Total numbers of AGY or TCN Ser codons per germline-encoded CDR sequences. 
Box plots were generated as indicated in Section “Materials and Methods.” Briefly, the center line indicates the median; box limits indicate the 25th and 75th percentiles; whiskers extend to minimum and maximum values, and crosses represent sample means. Notches represent the 95% confidence interval for each median. (C) Donut graphs represent the number of CDR1\u00262 AGY Ser codons minus the number of TCN Ser codons for a given gene. The gray, white, and black areas denote the number of IgV genes in which AGY Ser codon numbers are greater than, equal to, or less than TCN codon numbers respectively. Number of sequences indicated in center. p values were determined using a two-tailed paired t-test. ***p \u003c 0.0001. In addition to comparing observed/expected ratios for AGY and TCN codons, we also compared absolute numbers of these codons in mouse and human germline VH, Vκ, and Vλ genes. Despite a greater number of possible TCN codons, the bias favoring AGY Ser codons was still evident in all three major families of V genes for both species (Figures 1B,C). These abundance data are in agreement with data reported by Wagner et al. (17), showing that CDR AGY codons outnumber TCN codons at most CDR positions. Finally, the serine codon bias was not restricted to the idiosyncrasies of the Kabat CDR/FR definitions used in our analyses because it also applied to CDRs defined by the IMGT system (Figure S1 in Supplementary Material). Collectively, these results show that high frequencies of germline AGY serine codons in CDRs cannot be explained solely by a selection pressure favoring germline-encoded CDR serine residues.\n\nCDR AGY Codon Bias in Ig Genes Is the Product of an Evolutionary Selection Pressure\nThe frequent use of CDR AGY Ser codons among IgV-region genes from two different species (human and mouse) led us to speculate that this feature might be highly conserved in evolution. 
Thus, we analyzed IgVH gene sequences of cartilaginous fishes (class Chondrichthyes), which are descendants of the most ancient species with an adaptive immune system. The immune systems of species in this class share major features with those of mammals, including SHM, although not class switch recombination (32, 33). Our analysis of germline VH sequences from four Chondrichthyes species indicated that, as in mice and humans, AGY but not TCN Ser codons were enriched in germline-encoded CDR sequences (Figures S2A,B in Supplementary Material). Thus, the CDR AGY codon bias is a highly conserved feature of IgV-region genes. A similar trend was also observed in several other less distant species, by Jolly et al. (18).\n\nPreferential Use of AGY Triplets in the Ser Codon Reading Frame\nBecause the AGC triplet has been shown to be an intrinsically preferred target for AID-dependent SHM (13, 15, 16, 34, 35), it is plausible that high frequencies of CDR AGY codons resulted solely from an evolutionary pressure to ensure high somatic mutation frequencies in CDR sequences during immune responses. This would be consistent with the fact that αβTCR genes do not share the CDR AGY abundance and bias features with Ig genes (17, 18) (Figures S2C,D in Supplementary Material). If CDR AGY codons were preserved solely to enhance mutability, we would predict that AGY triplets would be equally frequent in all three reading frames. However, this was not the case. Even when only one AGY base was required to be contained within a CDR for inclusion in the non-coding CDR frame counts, AGY triplets in the Ser reading frame were nearly always more frequent than the combined frequencies of those in the two other reading frames (Figures 2A–C). This trend also held for AGC triplets contained within the context of the extremely mutable AGCT sequence (16, 36) (Figures S3A,B in Supplementary Material). 
Finally, the intrinsically mutable AGC triplet was consistently more frequent in the Ser reading frame than was the combined frequency for GCT triplets in all three reading frames (AGC on opposite strand), the only exception being the small mouse Vλ gene family (Figure S3C in Supplementary Material). These results argue that the abundance of germline CDR AGY codons was not solely due to an evolutionary selection pressure for high CDR mutability via SHM.\nFigure 2 Preferential use of the AGY triplets among CDR sequences in the Ser reading frame. (A) Schematic of how AGY triplets in the different reading frames were determined at CDR boundaries. AGY triplets at CDR boundaries were counted in non-coding frames if one or two bases were located in the CDR. (B) Numbers of in-frame Ser AGC codons compared to combined numbers of AGC triplets in two non-coding frames. (C) Same analysis as in (B) applied to AGT. Box plots and whiskers were defined in Figure 1 and in Section “Materials and Methods.”\n\nArginine Residues in Antiviral Ab Are Often Created by SHM of AGY Ser Codons\nAn abundance of CDR codons that are prone to mutate to encode antinuclear Ab seemed paradoxical. However, there is speculation that a modest degree of autoreactivity may be beneficial to antiviral immune responses (37–39). For example, some viruses display host-derived nuclear material on their capsids that might enhance B cell activation or antibody efficacy due to an avidity effect (40). Therefore, we sought to determine if Arg residues are frequently generated via SHM in antiviral Ab. At first, we examined somatic mutations in broadly neutralizing antibodies (bNAbs) against HIV. Although we found that somatic mutations in AGY codons frequently produced Arg codons in these Abs, the results were not easily interpreted because overall mutation frequencies were extremely high, and in many cases CDR boundaries could not be defined due to insertions and deletions. 
Therefore, we extended our analysis to 298 published sequences of human antibodies against eight other virus species or subspecies. This analysis revealed frequent somatic mutations converting AGY Ser codons in CDRs to Arg codons.\nIn two human studies involving the H1N1 influenza virus (23, 24), 17 out of 46 and 24 out of 49 antibodies had at least one AGY Ser to Arg amino acid replacement resulting from SHM (Figure 3A). Arg replacement mutations in CDR sequences accounted for 2.9 and 3.1% of all V-region gene missense mutations (CDRs and FRs) in the two studies, with replacements at germline AGY codons comprising most of these (2 and 2.23%). A similar trend was observed in antibodies against hepatitis A, B, and C, rhino, dengue, avian influenza, and West Nile viruses. CDR Arg mutations accounted for 2.4–9.4% of all missense mutations in V-region genes for these antibodies, most of which (1.5–6.6%) occurred at germline CDR AGY codons (Figure 3B; Table 1).\nFigure 3 Somatically generated Arg codons often arise at germline CDR AGY Ser codons in antiviral immune responses. (A) Sequences and analyses from two studies of anti-H1N1 antibodies, as described in Section “Materials and Methods.” Heavy and light chains for a particular clone were combined to generate data for the graphs. The data combine the results of CDR and FR analyses. Any → Arg indicates a mutation at any non-Arg codon that gives rise to an Arg codon. Ser → Arg indicates an AGY Ser codon to Arg codon mutation. Numbers inside graphs indicate number of clones that were analyzed (heavy plus light chain). (B) Bars represent the average number of indicated replacement mutations among antiviral antibodies (heavy or light chain genes). 
Influenza #1 (n = 92), Influenza #2 (n = 98), Rhinovirus (n = 12), Avian Influenza (n = 27), West Nile (n = 6), Dengue virus (n = 4), Hepatitis A, B and C (n = 59).\nTable 1 Amino acid replacements via SHM of CDR AGY Ser codons.\nImmunogen Asn (%) Gly (%) Thr (%) Arg (%) Others (%) #CDR AGY SHM\nInfluenzaa 22c 16c 19c 11 32 142\nInfluenzab 30c 12 23c 16 19 107\nWest Nile 20 0 60c 20 0 5\nDengue 14 0 43c 29 14 7\nRhinovirus 7 0 15 26 52 27\nAvian Influenza 50c 0 17 33 0 12\nHep. A, B, and C 22c 18 18 20 22 72\naAntibody sequences from Wrammert et al. (23).\nbAntibody sequences from Li et al. (24).\ncAmino acid replacements that occurred more often than Arg replacements at CDR AGY Ser codons.\n\nCDR AGY Codons Frequently Mutate to Produce Codons for Key Ag-Contact Residues in the Ab-Binding Site\nOur analyses of somatic mutations in antiviral Ab led to an unexpected finding: CDR AGY Ser codons frequently mutated to Asn, Thr, and Gly codons in addition to Arg codons. Most of these mutations occurred by single-base changes, predominantly at the central base in the AGY triplet (Table 2), which is the position that is preferentially targeted by AID (13). In many cases, mutations to these alternative codons, particularly those for Asn and Thr, were more frequent than to Arg codons. For example, in anti-influenza Abs, CDR AGY mutations to Asn and Thr codons were each approximately twice as frequent as mutations to Arg codons. These observations were particularly revealing because in their analyses of numerous crystal structures of Ab–Ag complexes, Raghunathan et al. (19) identified Asn, Thr, Arg, Gly, Ser, Asp, and Tyr as key (i.e., most frequent) Ag-contact residues.\nTable 2 Base distribution of somatic mutations in CDR AGY Ser codons.\nImmunogen AGY (%) AGY (%) AGY (%) 2 changes (%) 3 changes (%)\nInfluenzaa 12 53 11 20 4\nInfluenzab 11 52 15 20 2\nWest Nile 0 80 20 0 0\nDengue 15 57 14 14 0\nRhinovirus 0 22 19 52 7\nAvian Influenza 0 67 33 0 0\nHep. 
A, B, and C 12 35 18 35 0\naAntibody sequences from Wrammert et al. (23).\nbAntibody sequences from Li et al. (24). In the report by Raghunathan and colleagues, it was not clear which contact residues were generated by SHM. To determine if residues frequently generated by SHM of AGY Ser codons are associated with Ab affinity maturation, we analyzed 72 (46 mouse and 26 human) Ab–Ag crystal structures available in the RCSB protein data bank (pdb) database, identified predicted Ag-contact residues, and searched IgBLAST to distinguish those that were germline-encoded from those that were somatically generated. When mouse and human data where combined, the seven most frequent Ag-contact residues were Arg, Asp, Asn, Gly, Ser, Thr and Tyr (Figure S4 in Supplementary Material). This result is identical to that of Raghunathan et al. (19), even though only 4 of the 72 structures we analyzed were also analyzed by them. Yet, we found that only three (Asn, Ser, and Tyr) of those seven residues (Arg, Asn, Asp, Gly, Ser, Thr, and Tyr) were present at higher frequencies than expected within CDRs of mouse and human germline IgV-region genes (Figure 4A). Importantly, amino acids resulting from SHM accounted for only 10–23% (average 14.7%) of all Ag-contact residues (Table 3 footnotes; Figure S4 in Supplementary Material). This is relevant to our conclusion regarding AGY versatility because it means that the seven key Ag-contact residues were largely defined by germline-encoded contacts; yet four (Asn, Arg, Gly, and Thr) of the seven most abundant contact residues arise frequently from somatic mutations at CDR AGY codons.\nFigure 4 CDR AGY Ser codons play a key role in affinity maturation. (A) Ratio observed over expected for synonymous codons in CDR sequences of combined IgV genes (VH, Vκ, and Vλ). (B) Percentage of the total contact residues that were created by SHM in V-region sequences only. Each data set represents a germline-encoded codon given rise to any contact residue. 
Black bars represent the percentage of AGY Ser codons that gave rise to a key contact residue defined by Raghunathan et al. (19).\nTable 3 Amino acid replacements due to somatic mutation of germline AGY Ser codons.a\nContact mutations at AGYb % of all contact mutationsc\nHuman Mouse Human (%) Mouse (%)\nArg 4 7 5.55 6.73\nAsn 5 9 6.94 8.65\nGly 0 1 0 0.96\nThr 6 5 8.33 4.81\nOthers 15 8 20.83 7.69\naData from 26 human and 46 mouse crystal structures of Ag–Ab complexes.\nbV-region contact residues arising from SHM of AGY Ser codons. Numbers expressed in absolute numbers. Total contact residues analyzed were 317 (human) and 886 (mouse). Total contact residues that were associated with SHM of a V-region codon were 72 (human) and 104 (mouse).\ncPercentage of total somatically generated contacts residues that arose from mutation of AGY Ser codons. For somatically generated contact residues, mutations at AGY Ser codons were the most abundant by far, and occurred ~2–3 times more often than mutations at AAY Asn codons (Figure 4B), the second most consistently mutated codon group. Most importantly, AGY Ser codons mutated to contact residues more often than any other codon group (Figure 4B), and a large proportion of these (~70%) were those defined as key Ag-contact residues. AGY mutations to codons for Arg, Asn, and Thr were the most consistent, and this was true for both contact and non-contact residues (Table 3 and data not shown). AAY triplets are also intrinsically preferred targets of SHM (13, 15, 16). However, when considering the potential to mutate to 1 of the 6 non-synonymous key contact residues (Arg, Asn, Asp, Gly, Ser, Thr, and Tyr), AGY Ser codons are able to do so via 12 out of 18 possible single-base changes. 
For AAY (Asn), this occurs with 8 out of 18 base changes, and for TCN, it occurs with only 6 out of 36 base substitutions (Figure 5), a result that is in agreement with the observation by Chang and Casali that CDR, but not FR sequences, are prone to acquire replacement mutations upon random point mutation (41). Collectively, the results of these analyses indicate that AGY codons contribute to Ab affinity both directly, by encoding a Ser residue, and indirectly due to the ease with which they mutate to encode other residues beneficial to the process of Ab affinity maturation. We believe this is the most straightforward explanation for the conservation of AGY codon abundance in CDRs of germline IgV-region genes.\nFigure 5 AGY Ser codons plasticity. Probability of creating a key non-synonymous contact residue by one nucleotide change. Filled gray boxes indicate a key Ag-contact residue as defined by Raghunathan et al. (19). White boxes indicate a synonymous change, a non-key contact residue (defined in the text) or a stop codon."}
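
Note on Table 3 in the annotated text above: the percentage columns appear to be the AGY-derived contact counts divided by the totals given in footnote b (72 human and 104 mouse somatically generated contact residues). The following is a minimal sketch of that arithmetic, assuming this reading of the footnotes; the names `TABLE3_COUNTS`, `SHM_CONTACT_TOTALS`, and `percent_of_shm_contacts` are hypothetical and introduced only for illustration.

```python
# Counts copied from Table 3 (contact mutations at AGY Ser codons).
# Tuple order is (human, mouse).
TABLE3_COUNTS = {
    "Arg": (4, 7),
    "Asn": (5, 9),
    "Gly": (0, 1),
    "Thr": (6, 5),
    "Others": (15, 8),
}

# Footnote b: contact residues associated with SHM of any V-region codon.
SHM_CONTACT_TOTALS = (72, 104)  # (human, mouse)


def percent_of_shm_contacts(count, total):
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


if __name__ == "__main__":
    for residue, counts in TABLE3_COUNTS.items():
        human, mouse = (
            percent_of_shm_contacts(c, t)
            for c, t in zip(counts, SHM_CONTACT_TOTALS)
        )
        print(f"{residue}: human {human}%, mouse {mouse}%")
```

Under this assumption the computed values agree with the table to within rounding; the human Arg entry comes out as 5.56 when rounded, whereas the table prints 5.55, which suggests that value was truncated rather than rounded.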

    MyTest

    {"project":"MyTest","denotations":[{"id":"27920779-8011289-34707944","span":{"begin":224,"end":225},"obj":"8011289"},{"id":"27920779-20805563-34707945","span":{"begin":305,"end":306},"obj":"20805563"},{"id":"27920779-23181829-34707946","span":{"begin":308,"end":310},"obj":"23181829"},{"id":"27920779-18677426-34707946","span":{"begin":308,"end":310},"obj":"18677426"},{"id":"27920779-18621685-34707946","span":{"begin":308,"end":310},"obj":"18621685"},{"id":"27920779-15968001-34707946","span":{"begin":308,"end":310},"obj":"15968001"},{"id":"27920779-20805563-34707947","span":{"begin":421,"end":422},"obj":"20805563"},{"id":"27920779-20805563-34707948","span":{"begin":584,"end":585},"obj":"20805563"},{"id":"27920779-25634361-34707948","span":{"begin":584,"end":585},"obj":"25634361"},{"id":"27920779-8011289-34707948","span":{"begin":584,"end":585},"obj":"8011289"},{"id":"27920779-7651532-34707949","span":{"begin":589,"end":591},"obj":"7651532"},{"id":"27920779-7651532-34707950","span":{"begin":3311,"end":3313},"obj":"7651532"},{"id":"27920779-16497590-34707951","span":{"begin":4384,"end":4386},"obj":"16497590"},{"id":"27920779-16623761-34707952","span":{"begin":4388,"end":4390},"obj":"16623761"},{"id":"27920779-8738915-34707953","span":{"begin":4791,"end":4793},"obj":"8738915"},{"id":"27920779-11859119-34707954","span":{"begin":4963,"end":4965},"obj":"11859119"},{"id":"27920779-12943801-34707955","span":{"begin":4967,"end":4969},"obj":"12943801"},{"id":"27920779-8786330-34707956","span":{"begin":4971,"end":4973},"obj":"8786330"},{"id":"27920779-1420357-34707957","span":{"begin":4975,"end":4977},"obj":"1420357"},{"id":"27920779-8943046-34707958","span":{"begin":4979,"end":4981},"obj":"8943046"},{"id":"27920779-7651532-34707959","span":{"begin":5296,"end":5298},"obj":"7651532"},{"id":"27920779-8738915-34707960","span":{"begin":5300,"end":5302},"obj":"8738915"},{"id":"27920779-8786330-34707961","span":{"begin":5917,"end":5919},"obj":"8786330"},{"id":"27920779-25646473-34
707962","span":{"begin":5921,"end":5923},"obj":"25646473"},{"id":"27920779-15860590-34707963","span":{"begin":7265,"end":7267},"obj":"15860590"},{"id":"27920779-20882016-34707963","span":{"begin":7265,"end":7267},"obj":"20882016"},{"id":"27920779-23940276-34707963","span":{"begin":7265,"end":7267},"obj":"23940276"},{"id":"27920779-15529369-34707964","span":{"begin":7438,"end":7440},"obj":"15529369"},{"id":"27920779-21220454-34707965","span":{"begin":8212,"end":8214},"obj":"21220454"},{"id":"27920779-22615367-34707966","span":{"begin":8216,"end":8218},"obj":"22615367"},{"id":"27920779-21220454-34707967","span":{"begin":10239,"end":10241},"obj":"21220454"},{"id":"27920779-22615367-34707968","span":{"begin":10280,"end":10282},"obj":"22615367"},{"id":"27920779-11859119-34707969","span":{"begin":10839,"end":10841},"obj":"11859119"},{"id":"27920779-22407974-34707970","span":{"begin":11263,"end":11265},"obj":"22407974"},{"id":"27920779-21220454-34707967","span":{"begin":10239,"end":10241},"obj":"21220454"},{"id":"27920779-22615367-34707968","span":{"begin":10280,"end":10282},"obj":"22615367"},{"id":"27920779-22407974-34707971","span":{"begin":12524,"end":12526},"obj":"22407974"},{"id":"27920779-22407974-34707972","span":{"begin":13807,"end":13809},"obj":"22407974"},{"id":"27920779-11859119-34707973","span":{"begin":15207,"end":15209},"obj":"11859119"},{"id":"27920779-12943801-34707974","span":{"begin":15211,"end":15213},"obj":"12943801"},{"id":"27920779-8786330-34707975","span":{"begin":15215,"end":15217},"obj":"8786330"},{"id":"27920779-7916950-34707976","span":{"begin":15749,"end":15751},"obj":"7916950"},{"id":"27920779-22407974-34707977","span":{"begin":16371,"end":16373},"obj":"22407974"}],"namespaces":[{"prefix":"_base","uri":"https://www.uniprot.org/uniprot/testbase"},{"prefix":"UniProtKB","uri":"https://www.uniprot.org/uniprot/"},{"prefix":"uniprot","uri":"https://www.uniprot.org/uniprotkb/"}],"text":"Results\n\nAGY Ser Codons, but Not TCN Ser Codons, Are Enriched 
in Germline-Encoded CDR Sequences of IgV-Region Genes\nIt is well established that CDR Arg residues play a major role in specifying the nuclear reactivity of ANA (3). Moreover, in spontaneous SLE, many ANA arise by SHM of non-autoreactive Abs (1, 28–31), and this is often associated with the conversion of CDR germline-encoded AGY Ser codons into Arg codons (1). At the same time, germline IgVH, Vκ, and Vλ genes have unusually high frequencies of AGY Ser codons in CDRs, and this tendency holds for both mice and humans (1–3, 17).\nIf AGY Ser codon abundance in Ab CDRs were merely due to a selection pressure to preserve Ser residues among germline-encoded V-region genes, we would expect equally high frequencies of four other serine codons (TCN). However, CDR TCN codon abundance, as defined by observed/expected ratios, was inconsistent across mouse and human VH, Vκ, and Vλ genes, reaching only 2.3-fold more than expected in the most extreme case (mouse Vκ) and less than expected in mouse and human VH genes and mouse Vλ genes (Figure 1A). Moreover, in most cases, TCN abundance was higher in FRs than in CDRs. In contrast, AGY codons were far more abundant in CDRs than expected and consistently much more so than in FRs (Figure 1A). To avoid a bias in our analyses, we took expected frequencies from codon usage tables for mouse and human genes rather than the random expected frequency of 0.016 (1/61) for a given codon. This is because the TCG codon includes the rare CpG dinucleotide, so using 0.016 would inflate the expected cumulative frequency of TCN codons, thereby reducing observed/expected ratios for TCN.\nFigure 1 High frequencies of AGY, but not TCN Ser codons among germline-encoded CDR sequences of IgV-region genes. (A) Ratio observed/expected for AGY and TCN Ser codons in human and mouse IgV-region genes. Germline CDRs and FRs were defined using the Kabat numbering system. 
Expected ratio was defined by frequencies of 52,926 mouse codons and 40,662,582 human codons at http://www.kazusa.or.jp/codon/. (B). Total numbers of AGY or TCN Ser codons per germline-encoded CDR sequences. Box plots were generated as indicated in Section “Materials and Methods.” Briefly, the center line indicates the median; box limits indicate the 25th and 75th percentiles; whiskers extend to minimum and maximum values, and crosses represent sample means. Notches represent the 95% confidence interval for each median. (C) Donut graphs represent the number of CDR1\u00262 AGY Ser codons minus the number of TCN Ser codons for a given gene. The gray, white, and black areas denote the number of IgV genes in which AGY Ser codon numbers are greater than, equal to, or less than TCN codon numbers respectively. Number of sequences indicated in center. p values were determined using a two-tailed paired t-test. ***p \u003c 0.0001. In addition to comparing observed/expected ratios for AGY and TCN codons, we also compared absolute numbers of these codons in mouse and human germline VH, Vκ, and Vλ genes. Despite a greater number of possible TCN codons, the bias favoring AGY Ser codons was still evident in all three major families of V genes for both species (Figures 1B,C). These abundance data are in agreement with data reported by Wagner et al. (17), showing that CDR AGY codons outnumber TCN codons at most CDR positions. Finally, the serine codon bias was not restricted to the idiosyncrasies of the Kabat CDR/FR definitions used in our analyses because it also applied to CDRs defined by the IMGT system (Figure S1 in Supplementary Material). 
Collectively, these results show that high frequencies of germline AGY serine codons in CDRs cannot be explained solely by a selection pressure favoring germline-encoded CDR serine residues.\n\nCDR AGY Codon Bias in Ig Genes Is the Product of an Evolutionary Selection Pressure\nThe frequent use of CDR AGY Ser codons among IgV-region genes from two different species (human and mouse) led us to speculate that this feature might be highly conserved in evolution. Thus, we analyzed IgVH gene sequences of cartilaginous fishes (class Chondrichthyes), which are descendants of the most ancient species with an adaptive immune system. The immune systems of species in this class share major features with those of mammals, including SHM, although not class switch recombination (32, 33). Our analysis of germline VH sequences from four Chondrichthyes species indicated that, as in mice and humans, AGY but not TCN Ser codons were enriched in germline-encoded CDR sequences (Figures S2A,B in Supplementary Material). Thus, the CDR AGY codon bias is a highly conserved feature of IgV-region genes. A similar trend was also observed in several other less distant species, by Jolly et al. (18).\n\nPreferential Use of AGY Triplets in the Ser Codon Reading Frame\nBecause the AGC triplet has been shown to be an intrinsically preferred target for AID-dependent SHM (13, 15, 16, 34, 35), it is plausible that high frequencies of CDR AGY codons resulted solely from an evolutionary pressure to ensure high somatic mutation frequencies in CDR sequences during immune responses. This would be consistent with the fact that αβTCR genes do not share the CDR AGY abundance and bias features with Ig genes (17, 18) (Figures S2C,D in Supplementary Material). If CDR AGY codons were preserved solely to enhance mutability, we would predict that AGY triplets would be equally frequent in all three reading frames. However, this was not the case. 
Even when only one AGY base was required to be contained within a CDR for inclusion in the non-coding CDR frame counts, AGY triplets in the Ser reading frame were nearly always more frequent than the combined frequencies of those in the two other reading frames (Figures 2A–C). This trend also held for AGC triplets contained within the context of the extremely mutable AGCT sequence (16, 36) (Figures S3A,B in Supplementary Material). Finally, the intrinsically mutable AGC triplet was consistently more frequent in the Ser reading frame than was the combined frequency for GCT triplets in all three reading frames (AGC on opposite strand), the only exception being the small mouse Vλ gene family (Figure S3C in Supplementary Material). These results argue that the abundance of germline CDR AGY codons was not solely due to an evolutionary selection pressure for high CDR mutability via SHM.\nFigure 2 Preferential use of the AGY triplets among CDR sequences in the Ser reading frame. (A) Schematic of how AGY triplets in the different reading frames were determined at CDR boundaries. AGY triplets at CDR boundaries were counted in non-coding frames if one or two bases were located in the CDR. (B) Numbers of in-frame Ser AGC codons compared to combined numbers of AGC triplets in two non-coding frames. (C) Same analysis as in (B) applied to AGT. Box plots and whiskers were defined in Figure 1 and in Section “Materials and Methods.”\n\nArginine Residues in Antiviral Ab Are Often Created by SHM of AGY Ser Codons\nAn abundance of CDR codons that are prone to mutate to encode antinuclear Ab seemed paradoxical. However, there is speculation that a modest degree of autoreactivity may be beneficial to antiviral immune responses (37–39). For example, some viruses display host-derived nuclear material on their capsids that might enhance B cell activation or antibody efficacy due to an avidity effect (40). 
Therefore, we sought to determine if Arg residues are frequently generated via SHM in antiviral Ab. At first, we examined somatic mutations in broadly neutralizing antibodies (bNAbs) against HIV. Although we found that somatic mutations in AGY codons frequently produced Arg codons in these Abs, the results were not easily interpreted because overall mutation frequencies were extremely high, and in many cases CDR boundaries could not be defined due to insertions and deletions. Therefore, we extended our analysis to 298 published sequences of human antibodies against eight other virus species or subspecies. This analysis revealed frequent somatic mutations converting AGY Ser codons in CDRs to Arg codons.\nIn two human studies involving the H1N1 influenza virus (23, 24), 17 out of 46 and 24 out of 49 antibodies had at least one AGY Ser to Arg amino acid replacement resulting from SHM (Figure 3A). Arg replacement mutations in CDR sequences accounted for 2.9 and 3.1% of all V-region gene missense mutations (CDRs and FRs) in the two studies, with replacements at germline AGY codons comprising most of these (2 and 2.23%). A similar trend was observed in antibodies against hepatitis A, B, and C, rhino, dengue, avian influenza, and West Nile viruses. CDR Arg mutations accounted for 2.4–9.4% of all missense mutations in V-region genes for these antibodies, most of which (1.5–6.6%) occurred at germline CDR AGY codons (Figure 3B; Table 1).\nFigure 3 Somatically generated Arg codons often arise at germline CDR AGY Ser codons in antiviral immune responses. (A) Sequences and analyses from two studies of anti-H1N1 antibodies, as described in Section “Materials and Methods.” Heavy and light chains for a particular clone were combined to generate data for the graphs. The data combine the results of CDR and FR analyses. Any → Arg indicates a mutation at any non-Arg codon that gives rise to an Arg codon. Ser → Arg indicates an AGY Ser codon to Arg codon mutation. 
Numbers inside graphs indicate number of clones that were analyzed (heavy plus light chain). (B) Bars represent the average number of indicated replacement mutations among antiviral antibodies (heavy or light chain genes). Influenza #1 (n = 92), Influenza #2 (n = 98), Rhinovirus (n = 12), Avian Influenza (n = 27), West Nile (n = 6), Dengue virus (n = 4), Hepatitis A, B and C (n = 59).\nTable 1 Amino acid replacements via SHM of CDR AGY Ser codons.\nImmunogen Asn (%) Gly (%) Thr (%) Arg (%) Others (%) #CDR AGY SHM\nInfluenzaa 22c 16c 19c 11 32 142\nInfluenzab 30c 12 23c 16 19 107\nWest Nile 20 0 60c 20 0 5\nDengue 14 0 43c 29 14 7\nRhinovirus 7 0 15 26 52 27\nAvian Influenza 50c 0 17 33 0 12\nHep. A, B, and C 22c 18 18 20 22 72\naAntibody sequences from Wrammert et al. (23).\nbAntibody sequences from Li et al. (24).\ncAmino acid replacements that occurred more often than Arg replacements at CDR AGY Ser codons.\n\nCDR AGY Codons Frequently Mutate to Produce Codons for Key Ag-Contact Residues in the Ab-Binding Site\nOur analyses of somatic mutations in antiviral Ab led to an unexpected finding: CDR AGY Ser codons frequently mutated to Asn, Thr, and Gly codons in addition to Arg codons. Most of these mutations occurred by single-base changes, predominantly at the central base in the AGY triplet (Table 2), which is the position that is preferentially targeted by AID (13). In many cases, mutations to these alternative codons, particularly those for Asn and Thr, were more frequent than to Arg codons. For example, in anti-influenza Abs, CDR AGY mutations to Asn and Thr codons were each approximately twice as frequent as mutations to Arg codons. These observations were particularly revealing because in their analyses of numerous crystal structures of Ab–Ag complexes, Raghunathan et al. 
(19) identified Asn, Thr, Arg, Gly, Ser, Asp, and Tyr as key (i.e., most frequent) Ag-contact residues.\nTable 2 Base distribution of somatic mutations in CDR AGY Ser codons.\nImmunogen AGY (%) AGY (%) AGY (%) 2 changes (%) 3 changes (%)\nInfluenzaa 12 53 11 20 4\nInfluenzab 11 52 15 20 2\nWest Nile 0 80 20 0 0\nDengue 15 57 14 14 0\nRhinovirus 0 22 19 52 7\nAvian Influenza 0 67 33 0 0\nHep. A, B, and C 12 35 18 35 0\naAntibody sequences from Wrammert et al. (23).\nbAntibody sequences from Li et al. (24). In the report by Raghunathan and colleagues, it was not clear which contact residues were generated by SHM. To determine if residues frequently generated by SHM of AGY Ser codons are associated with Ab affinity maturation, we analyzed 72 (46 mouse and 26 human) Ab–Ag crystal structures available in the RCSB protein data bank (pdb) database, identified predicted Ag-contact residues, and searched IgBLAST to distinguish those that were germline-encoded from those that were somatically generated. When mouse and human data where combined, the seven most frequent Ag-contact residues were Arg, Asp, Asn, Gly, Ser, Thr and Tyr (Figure S4 in Supplementary Material). This result is identical to that of Raghunathan et al. (19), even though only 4 of the 72 structures we analyzed were also analyzed by them. Yet, we found that only three (Asn, Ser, and Tyr) of those seven residues (Arg, Asn, Asp, Gly, Ser, Thr, and Tyr) were present at higher frequencies than expected within CDRs of mouse and human germline IgV-region genes (Figure 4A). Importantly, amino acids resulting from SHM accounted for only 10–23% (average 14.7%) of all Ag-contact residues (Table 3 footnotes; Figure S4 in Supplementary Material). 
This is relevant to our conclusion regarding AGY versatility because it means that the seven key Ag-contact residues were largely defined by germline-encoded contacts; yet four (Asn, Arg, Gly, and Thr) of the seven most abundant contact residues arise frequently from somatic mutations at CDR AGY codons.\nFigure 4 CDR AGY Ser codons play a key role in affinity maturation. (A) Ratio observed over expected for synonymous codons in CDR sequences of combined IgV genes (VH, Vκ, and Vλ). (B) Percentage of the total contact residues that were created by SHM in V-region sequences only. Each data set represents a germline-encoded codon given rise to any contact residue. Black bars represent the percentage of AGY Ser codons that gave rise to a key contact residue defined by Raghunathan et al. (19).\nTable 3 Amino acid replacements due to somatic mutation of germline AGY Ser codons.a\nContact mutations at AGYb % of all contact mutationsc\nHuman Mouse Human (%) Mouse (%)\nArg 4 7 5.55 6.73\nAsn 5 9 6.94 8.65\nGly 0 1 0 0.96\nThr 6 5 8.33 4.81\nOthers 15 8 20.83 7.69\naData from 26 human and 46 mouse crystal structures of Ag–Ab complexes.\nbV-region contact residues arising from SHM of AGY Ser codons. Numbers expressed in absolute numbers. Total contact residues analyzed were 317 (human) and 886 (mouse). Total contact residues that were associated with SHM of a V-region codon were 72 (human) and 104 (mouse).\ncPercentage of total somatically generated contacts residues that arose from mutation of AGY Ser codons. For somatically generated contact residues, mutations at AGY Ser codons were the most abundant by far, and occurred ~2–3 times more often than mutations at AAY Asn codons (Figure 4B), the second most consistently mutated codon group. Most importantly, AGY Ser codons mutated to contact residues more often than any other codon group (Figure 4B), and a large proportion of these (~70%) were those defined as key Ag-contact residues. 
AGY mutations to codons for Arg, Asn, and Thr were the most consistent, and this was true for both contact and non-contact residues (Table 3 and data not shown). AAY triplets are also intrinsically preferred targets of SHM (13, 15, 16). However, when considering the potential to mutate to 1 of the 6 non-synonymous key contact residues (Arg, Asn, Asp, Gly, Ser, Thr, and Tyr), AGY Ser codons are able to do so via 12 out of 18 possible single-base changes. For AAY (Asn), this occurs with 8 out of 18 base changes, and for TCN, it occurs with only 6 out of 36 base substitutions (Figure 5), a result that is in agreement with the observation by Chang and Casali that CDR, but not FR sequences, are prone to acquire replacement mutations upon random point mutation (41). Collectively, the results of these analyses indicate that AGY codons contribute to Ab affinity both directly, by encoding a Ser residue, and indirectly due to the ease with which they mutate to encode other residues beneficial to the process of Ab affinity maturation. We believe this is the most straightforward explanation for the conservation of AGY codon abundance in CDRs of germline IgV-region genes.\nFigure 5 AGY Ser codons plasticity. Probability of creating a key non-synonymous contact residue by one nucleotide change. Filled gray boxes indicate a key Ag-contact residue as defined by Raghunathan et al. (19). White boxes indicate a synonymous change, a non-key contact residue (defined in the text) or a stop codon."}
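
Note on the single-base-change counts quoted in the annotated text above (12 of 18 for AGY, 8 of 18 for AAY, 6 of 36 for TCN): these can be rechecked by enumerating every point mutation of each codon against the standard genetic code. The sketch below assumes that a change counts only when it is non-synonymous and yields one of the seven key contact residues named in the text (Arg, Asn, Asp, Gly, Ser, Thr, Tyr); the function and variable names are hypothetical and are not taken from the article.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
# '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# One-letter codes for Arg, Asn, Asp, Gly, Ser, Thr, Tyr.
KEY_CONTACTS = set("RNDGSTY")


def key_contact_changes(codons):
    """Count single-base changes that give a non-synonymous key contact residue.

    Returns (hits, total), where total is the number of possible
    single-base substitutions across the given codons (9 per codon).
    """
    hits = 0
    total = 0
    for codon in codons:
        original = CODON_TABLE[codon]
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue  # not a mutation
                mutant = codon[:pos] + base + codon[pos + 1:]
                total += 1
                residue = CODON_TABLE[mutant]
                if residue != original and residue in KEY_CONTACTS:
                    hits += 1
    return hits, total


if __name__ == "__main__":
    groups = {
        "AGY (Ser)": ["AGC", "AGT"],
        "AAY (Asn)": ["AAC", "AAT"],
        "TCN (Ser)": ["TCT", "TCC", "TCA", "TCG"],
    }
    for name, codons in groups.items():
        hits, total = key_contact_changes(codons)
        print(f"{name}: {hits} of {total}")
```

With these assumptions the enumeration gives 12 of 18 for AGY, 8 of 18 for AAY, and 6 of 36 for TCN, matching the figures in the text. Because the starting residue is excluded as synonymous, each codon group has six possible non-synonymous key targets, which is consistent with the text's "1 of the 6" wording even though seven residues are listed.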