Methods

Subjects and samples

Patients with different manifestations of active TB were recruited at clinics in the Shenzhen Third People's Hospital and Shenzhen Polytechnic College in Shenzhen, China. Healthy adults with no history of TB disease were recruited as the control group. All participants had received bacillus Calmette-Guérin (BCG) vaccination at birth. We used an M. tuberculosis-specific IFN-γ enzyme-linked immunospot (ELISPOT) assay to exclude latent TB infection (LTBI) among healthy donors, and only those who were ELISPOT negative were selected as healthy controls. The diagnosis of tuberculosis was based on Mycobacterium tuberculosis examination, clinical symptoms, and chest X-ray examination, as described previously [13]. Samples collected before February 2010 were used as the experimental cohort (479 cases and 358 controls), and samples collected after that date were used as the validation cohort (413 cases and 241 controls). Age distributions and male/female ratios are listed in Supplementary Table S3. The study obtained ethical approval from the Institutional Review Board of the Shenzhen Third People's Hospital, and written informed consent was obtained from all patients. Clinical specimens from patients with TB were collected within one week after anti-TB treatment. Whole blood was collected by venipuncture from the populations mentioned above.

SNP selection

Because we focus on SNPs with regulatory roles, we were particularly interested in two types of SNP: (a) SNPs located in a putative transcription factor binding site whose two alleles alter the binding score, and (b) SNPs located in a putative microRNA target site whose two alleles alter the miRNA-target interaction score. To search for SNPs with alleles that alter putative transcription factor binding sites, we scanned the promoter region of the gene with annotated Position Weight Matrices (PWMs) from the JASPAR [33], UniPROBE [34] and TRANSFAC [35] databases.
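To make the idea of scanning a sequence with a PWM concrete, the following is a minimal sketch, not the PWM_SCAN implementation used in this study. It slides a motif matrix along a sequence and reports the highest log-odds score, then compares two sequences that differ at one position. The matrix `TOY_PWM`, the uniform background, the example sequences, and the function names are all hypothetical and chosen for illustration only. The study's analysis works with p-values rather than raw scores.

```python
import math

# Hypothetical 4-column PWM resembling a "TATA" motif. The values are
# illustrative only and are not taken from JASPAR, UniPROBE or TRANSFAC.
TOY_PWM = [
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
]
BACKGROUND = 0.25  # assumed uniform base composition


def window_score(window, pwm, background=BACKGROUND):
    """Sum of per-position log2 odds (motif probability / background)."""
    return sum(math.log2(col[base] / background)
               for base, col in zip(window, pwm))


def best_site(seq, pwm):
    """Return (score, offset) of the highest-scoring window in seq."""
    width = len(pwm)
    return max(
        (window_score(seq[i:i + width], pwm), i)
        for i in range(len(seq) - width + 1)
    )


# Two hypothetical allele-specific sequences differing only at index 4 (T vs C).
ref_score, ref_pos = best_site("GGTATAGG", TOY_PWM)
alt_score, alt_pos = best_site("GGTACAGG", TOY_PWM)
print(ref_pos, round(ref_score, 3), alt_pos, round(alt_score, 3))
```

In this toy case both alleles place the best site at the same offset, but the substituted base lowers the score, which is the kind of allele-dependent difference the scan is designed to detect.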
We first obtained the gene coordinates (txStart and txEnd) for human genome hg18 (NCBI36) from the RefSeq table of the UCSC Table Browser (http://genome.ucsc.edu/). For the annotated SNPs in dbSNP129 that lie within 2000 bp upstream to 500 bp downstream of the IL-22 gene, we extracted their flanking sequences (±25 bp) from the dbSNP website (http://www.ncbi.nlm.nih.gov/projects/SNP/). For each SNP, we kept the flanking sequence the same and changed the allele, thus obtaining one sequence per allele and forming a sequence set. We then used the PWM_SCAN algorithm [36] to scan each sequence in the set and test whether it contained a putative binding site (PBS), using the method we described previously [37]. We considered sites with a p-value ≤ 0.001 to be PBSs. If the SNP was within the PBS and the two alleles had different p-values, we calculated the ratio (Sr) of the p-values by dividing the larger p-value by the smaller one. Sr measures how much the two alleles in the PBS change the binding score between the PBS and the putative binding protein. We then converted Sr into a p-value based on a background distribution from a permutation test, using the FastPval program [38]. If an SNP had an Sr with p-value < 0.01, we considered the SNP functional and selected it for genotyping.

To search for SNPs with alleles that alter putative microRNA target sites, we first searched the PITA database for microRNAs that could target the 3′ UTR of the IL-22 gene. For each SNP located in a target site, we obtained its flanking sequence (±50 bp) from the UCSC Genome Browser according to its genomic location. We then changed the polymorphic site and obtained one sequence for each allele. First, we used RNAhybrid to test whether the change of allele also changed the microRNA-target interaction. RNAhybrid calculates the minimum hybridization energy between the microRNA and its target.
If the alleles at the polymorphic site change this energy significantly, they might enhance or repress the regulatory function of the microRNA and thus play a role in the disease. Second, we used RNAfold to calculate the thermodynamics of the putative target site. If the alleles change the thermodynamics of the target, this might affect the chance of the microRNA binding to its target, and thus affect the function of the microRNA. SNPs with statistical significance in either target thermodynamics or microRNA-target interaction were selected for genotyping.

In addition to the regulatory SNPs, we used the traditional tagging procedure to select SNPs that cover the IL-22 gene [39]. We chose the CHB+JPT HapMap panel, included our regulatory SNPs, set the r² threshold to 0.8, and used default settings for all other parameters. We also selected SNPs in the IL-22 gene region that have been reported to be associated with disease. The functional, tagged, and selected SNPs are shown in Supplementary Table S4. The consensus sequences of the putative binding sites for each selected SNP are shown in Supplementary Table S2.

SNP genotyping

Genomic DNA was extracted with a DNA isolation kit (Qiagen Inc., Germany), following the manufacturer's instructions. SNPs were typed on a high-throughput Sequenom® genotyping platform (San Diego, CA, USA). Genotypes were determined by a homogeneous MassEXTEND assay. The MassARRAY Assay Design software was used to design allele-specific extension primers. The primers designed for each selected SNP are listed in Supplementary Table S5. Genotyping for the experimental cohort was carried out in Shenzhen, and genotyping for the validation cohort was carried out in both Shenzhen and Hong Kong, with concordance >98%.

Statistical analysis

We used PLINK v1.07 and Haploview v4.0 for all analyses. At each stage, SNPs that deviated from Hardy-Weinberg equilibrium (p-value < 0.05) were removed from further analysis.
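The Hardy-Weinberg filter can be illustrated with a minimal sketch. It uses a one-degree-of-freedom chi-square goodness-of-fit test on genotype counts, which is an assumption made here for simplicity and is not necessarily the test applied by PLINK in this study. The function name and the example counts are hypothetical.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) for deviation from Hardy-Weinberg equilibrium.

    Takes the observed counts of the three genotypes at a biallelic SNP and
    returns (chi2, p_value).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expectation (monomorphic SNPs).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotype counts: the first SNP matches HWE expectations,
# the second has no heterozygotes at all and would be filtered out.
for counts in [(25, 50, 25), (50, 0, 50)]:
    chi2, p_value = hwe_chi_square(*counts)
    print(counts, "remove" if p_value < 0.05 else "keep")
```

A SNP whose p-value falls below 0.05 under such a test would be dropped before the association analysis.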
We used allelic association tests to compare the frequencies of SNP alleles in cases and controls. Odds ratios (ORs) and 95% confidence intervals (CIs) were estimated using logistic regression models. We performed allelic association tests separately for two age groups (≤25 and >25 years). The haplotype association test was performed in the region surrounding the rs2227473 marker. SNPs or haplotypes with p-values < 0.05 were considered significant.

Quantification of IL-22 under specific and non-specific stimulation of PBMCs

Peripheral blood samples (5 ml) were collected from TB patients, and peripheral blood mononuclear cells (PBMCs) were isolated using Ficoll lymphocyte separation medium. PBMCs were then seeded into a 96-well plate at a cell density of 4×10⁵. For non-specific stimulation, 1 μg/ml of anti-CD3 and anti-CD28 monoclonal antibodies was added; for specific stimulation, 20 μg/ml of Mycobacterium tuberculosis (Mtb) lysate was added. The stimulated PBMCs were incubated at 37°C in 5% CO₂ for 96 h. Cell supernatants were used for IL-22 protein quantification by ELISA.
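As an illustration of the allelic association analysis described under Statistical analysis, the sketch below computes a chi-square test, an odds ratio, and a Wald 95% confidence interval from a 2×2 table of allele counts. For a single binary predictor, the odds ratio from the table equals the one a logistic regression would estimate. This is a simplified stand-in, not the PLINK procedure itself, and the function name and counts are hypothetical.

```python
import math

Z_95 = 1.959963984540054  # standard normal quantile for a two-sided 95% CI


def allelic_association(case_a, case_b, ctrl_a, ctrl_b):
    """Allelic chi-square test, odds ratio and Wald 95% CI.

    Arguments are allele counts (not genotype counts) for alleles A and B
    in cases and controls. All four counts must be greater than zero.
    """
    n = case_a + case_b + ctrl_a + ctrl_b
    cross = case_a * ctrl_b - case_b * ctrl_a
    chi2 = n * cross ** 2 / (
        (case_a + case_b) * (ctrl_a + ctrl_b)
        * (case_a + ctrl_a) * (case_b + ctrl_b)
    )
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square, 1 df
    odds_ratio = (case_a * ctrl_b) / (case_b * ctrl_a)
    # Standard error of log(OR) from the 2x2 table.
    se = math.sqrt(1 / case_a + 1 / case_b + 1 / ctrl_a + 1 / ctrl_b)
    lower = math.exp(math.log(odds_ratio) - Z_95 * se)
    upper = math.exp(math.log(odds_ratio) + Z_95 * se)
    return {"chi2": chi2, "p": p_value, "or": odds_ratio, "ci": (lower, upper)}


# Hypothetical allele counts, not data from this study.
result = allelic_association(case_a=30, case_b=70, ctrl_a=15, ctrl_b=85)
print(result)
```

Running the same function on subsets of samples, for example split by age group, would mirror the stratified tests described in the text.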