Results

Reduced susceptibility of the rs2227473A allele in patients with TB in stage I data

In stage I, comprising 48 cases and 48 controls, the rs2227473 marker, which is located 1756 bp upstream of the IL-22 gene transcription start site (TSS), was in Hardy-Weinberg equilibrium. The minor allele frequency of 0.103 in the controls is comparable to the HapMap annotations (0.102 for CHB+JPT, 0.120 for YRI, and 0.208 for CEU). The A allele is associated with decreased TB susceptibility at a p-value of 0.0275, with an odds ratio (OR) of 0.188 (confidence interval (CI) = 0.037–0.967), as shown in Table 1. Haplotype analysis was then performed in the region surrounding this SNP. We found that the haplotype TCATGA, which is composed of SNPs rs2227485, rs2227484, rs2227483, rs2227478, rs2227473, and rs2227472, shows a trend of association with decreased TB susceptibility at a p-value of 4.89E-5 (Supplementary Table S1). We therefore selected these 6 SNPs for genotyping in the second stage.

Reduced susceptibility of the rs2227473A allele in patients with TB in stage II and pooled data

The association of rs2227473 was confirmed in the stage II experimental dataset of 431 cases and 310 controls. As shown in Table 1, the association is significant at a p-value of 0.0343, with OR = 0.694 (95% CI = 0.494–0.975). Analysis of the pooled data (479 cases and 358 controls) showed an even more significant association at a p-value of 0.0086, with OR = 0.653 (95% CI = 0.449–0.896). This SNP was predicted to alter protein-DNA interactions involving several transcription factors (TFs), including three TFs from the UniPROBE database and two TFs from the JASPAR database (Supplementary Table S2).

Marginal association between rs2227473 and TB in validation cohort

The association was further validated in an independent cohort of 413 cases and 241 controls, collected after February 2010 (Table 1).
The association was marginally significant in this dataset, with a p-value of 0.061 and OR = 0.702 (95% CI = 0.484–1.107). However, when we combined the experimental and validation cohorts, the association was highly significant at a p-value of 0.001 (OR = 0.663, 95% CI = 0.518–0.847).

Effects of rs2227473 genotypes on IL-22 protein expression

To test whether different genotypes of rs2227473 affect IL-22 expression levels, we stimulated patients' peripheral blood mononuclear cells (PBMCs) under both antigen-nonspecific (anti-CD3/anti-CD28) and Mtb antigen-specific conditions. Patients carrying the A allele (GA+AA, n = 29) at rs2227473 had significantly higher IL-22 protein production than those without the A allele (GG, n = 29) under both nonspecific (p-value = 0.0095) and specific (p-value = 0.0099) stimulation (Figure 1).

Other polymorphisms in the IL-22 promoter region that reduce susceptibility to TB in stage II and pooled data

Among the other five SNPs we genotyped in stage II of the experimental cohort, four showed significant association with TB susceptibility (Table 2). The positions of the four SNPs relative to the IL-22 TSS are rs2227472 (-1851 bp), rs2227478 (-1340 bp), rs2227483 (-894 bp), and rs2227485 (-431 bp). SNP rs2227472 is 95 bp upstream of our major SNP, rs2227473. Its A allele is the major allele in controls but the minor allele in cases, in both the stage II and pooled data. Similar patterns were observed for two other SNPs, rs2227478 and rs2227485, which are 416 bp and 1325 bp downstream of rs2227473, respectively. For the stage II data, all four SNPs have p-values in the range 0.021–0.044, with ORs ranging from 0.74 to 0.80 (Table 2). Similar results were observed for the pooled data.

Haplotypes in the IL-22 promoter region that affect susceptibility to TB in pooled data

We observed very high linkage disequilibrium among the five SNPs in the control population of the pooled data (Figure 2).
We thus performed an association study between the haplotypes of these five SNPs and TB susceptibility. As shown in Table 3, haplotype CTTAA (SNPs in the order rs2227485, rs2227483, rs2227478, rs2227473, and rs2227472) shows a very strong association with decreased TB susceptibility, at a p-value of 2.12E-6, with an OR of 0.04 (95% CI = 0.01–0.35). On the other hand, the haplotype TATGG, which consists of all major alleles, shows a significant association with increased TB susceptibility at a p-value of 0.01, with an OR of 1.49 (95% CI = 1.05–2.13). In the pooled data of the two stages, the p-value is slightly more significant for rs2227478 and slightly less significant for rs2227472 than in stage II alone. The linkage disequilibrium (LD) values for the five SNPs are given in Figure 2, showing moderate LD.

The association between SNPs in the IL-22 promoter region and TB susceptibility in pooled data stratified by age at diagnosis (≤25 and >25)

We further stratified the pooled data of the experimental cohort according to the patients' age at hospital admission. As shown in Table 4, of the five SNPs showing significant association in the pooled data, three (rs2227472, rs2227483, and rs2227485) showed more significant association in younger patients but not in older patients, and the other two (rs2227473 and rs2227478) showed significant association in older patients but not in younger patients. Notably, SNP rs2227485, which is marginally significant in the pooled data (p-value = 0.049), showed a highly significant association with decreased TB susceptibility in younger patients (age ≤25 years) at a p-value of 5.2E-5, with OR = 0.496 (95% CI = 0.303–0.663). In the older group (age >25 years), however, this SNP did not show any association with TB susceptibility. The same holds for SNP rs2227472.
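
The odds ratios and confidence intervals quoted throughout these results are statistics of the kind obtained from a 2×2 table of allele counts in cases and controls. The following is a minimal sketch of that calculation, assuming Woolf's logit method for the 95% CI. The text above does not state which method or software was used, the function name `allelic_odds_ratio` is introduced here for illustration only, and the counts in the example are hypothetical; they are not taken from Tables 1–4.

```python
import math


def allelic_odds_ratio(case_minor, case_major, ctrl_minor, ctrl_major, z=1.96):
    """Return (OR, lower, upper) for a 2x2 allele-count table.

    Assumption: Woolf's logit method, i.e. a normal approximation on
    log(OR) with standard error sqrt(1/a + 1/b + 1/c + 1/d).
    z = 1.96 gives an approximate 95% confidence interval.
    """
    counts = (case_minor, case_major, ctrl_minor, ctrl_major)
    if min(counts) <= 0:
        raise ValueError("all four allele counts must be positive")

    # Odds of carrying the minor allele in cases divided by the odds in controls.
    odds_ratio = (case_minor * ctrl_major) / (case_major * ctrl_minor)

    # Standard error of log(OR) under Woolf's approximation.
    se_log_or = math.sqrt(sum(1.0 / c for c in counts))

    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical allele counts (NOT the study's data): minor/major alleles
# in cases, then minor/major alleles in controls.
or_value, ci_low, ci_high = allelic_odds_ratio(20, 180, 40, 160)
print(f"OR = {or_value:.3f} (95% CI = {ci_low:.3f}-{ci_high:.3f})")
```

An interval lying entirely below 1, as in this hypothetical example, corresponds to an allele associated with reduced susceptibility, whereas an interval that spans 1, such as the validation-cohort result for rs2227473 (0.484–1.107), does not reach significance at the 5% level.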