Methods Genotyping, imputation, and GWAS Korean samples used in the East Asian T2DM meta-analysis have been described by Cho et al. [9]. The samples included 1,042 cases with T2DM and 2,943 controls from the Korea Association Resource (KARE) project that were recruited from Ansan and Ansung population-based cohorts, aged 40 to 69. We used the p-values of both the original unimputed (259,188 SNPs) and imputed (1,152,947 SNPs) genotypes, based on Affymetrix Genome-Wide Human SNP array 5.0 (Affymetrix, Santa Clara, CA, USA), that were made by removing samples and markers that failed a quality control test [10]. SNP imputation has been described by Cho et al. [9]. Briefly, IMPUTE, MACH, and BEAGLE were used, together with haplotype reference panels from the Japanese (JPT) and Han Chinese (CHB) samples that are available in the HapMap database on the basis of HapMap build 36. GSA-SNP and i-GSEA4GWAS The GO database was downloaded from the Gene Ontology Consortium (12,902 terms) [11], and the KEGG pathway database was downloaded from the KEGG (311 terms) [12]. For both databases, only the gene sets having 10-200 member genes were used (3534 GO and 211 KEGG terms). For both GSA-SNP and i-GSEA4GWAS, 20-kb padding was added to both ends of each gene. Usually, these methods, like i-GSEA4GWAS, pick up the best p-value and assign it to the encompassing gene. On the contrary, GSA-SNP allows one to choose different schemes for assigning the SNP p-values to each gene: either the best or the second best p-value within the gene boundary. Choosing the second best p-value has been recommended, as it may reduce the false positive associations with little loss of sensitivity [4]. For GSA-SNP, we downloaded the standalone program (as of Jan. 2011) and executed it locally. For i-GSEA4GWAS, we used the web server version by uploading the SNP p-values. GSA-SNP allows several approaches of evaluating gene set significance. While other approaches require p-values from permutation tests, Z-standardization requires no permuted p-values. The score of a gene is defined as -log of the p-value assigned to the gene. For each gene set, the scores of its member genes are averaged, and the Z-statistics of these scores are used to estimate the significance under the assumption of a normal distribution. The effect of multiple testing is corrected by the false discovery rate (FDR) method [13]. On the other hand, i-GSEA4GWAS compares the distribution of the member gene scores of a gene set to all the genes using K-S statistics and corrects the multiple testing effect using FDR that is based on SNP permutation tests. Variation in the number of member genes among gene sets is taken care of by multiplying a ratio of 'highly significant' genes in a gene set relative to those among all genes. Here, the 'highly significant' genes are defined as the genes that map with at least one of the top 5% of all SNPs in the dataset.