Discussion

This study evaluated whether the GRS, which aggregates information from multiple genetic variants, improves the prediction of CRC risk beyond that afforded by conventional risk factors. For both men and women, inclusion of the counted GRS or weighted GRS increased the AUC by 0.5% to 4.2% beyond the AUC provided by conventional risk factors, such as age and family history of CRC. Men with a positive family history of CRC and a GRS in the highest quartile had a statistically significantly higher risk of CRC than those without a family history of CRC and with a GRS in the lowest quartile. Women with a positive family history of CRC and a GRS in the highest quartile also had a higher risk of CRC than those without a family history of CRC and with a GRS in the lowest quartile, but this result was not statistically significant.

CRC is a multifactorial disease in which a variety of elements lead to the development of clinical manifestations [37]. This recognition has led to the development of risk assessment tools that attempt to synthesize the values of numerous variables into a single statement regarding the risk of developing cancer [38]. In this study, 20 SNPs were genotyped in Korean men and in Korean women. Among these, the 3 SNPs in Korean men and the 5 SNPs in Korean women showing the strongest association with CRC were used to calculate the GRS. The counted GRS was calculated using a linear weighting of 0, 1, or 2 for genotypes containing 0, 1, or 2 risk alleles, respectively. The weighted GRS was computed by multiplying each beta-coefficient by the number of corresponding risk alleles. However, because some beta-coefficients may be negative, this product can be negative for some SNP genotypes. The OR values for CRC obtained with the weighted GRS may therefore differ from those obtained with the counted GRS.
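The two scoring schemes described above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code used in the study: the SNP identifiers and beta-coefficients are hypothetical placeholders, and summing the per-SNP products into a single weighted score is an assumption about how the products were combined.

```python
from typing import Mapping


def counted_grs(risk_alleles: Mapping[str, int]) -> int:
    """Counted GRS: total number of risk alleles (0, 1, or 2 per SNP)."""
    for snp, count in risk_alleles.items():
        if count not in (0, 1, 2):
            raise ValueError(f"{snp}: risk allele count must be 0, 1, or 2")
    return sum(risk_alleles.values())


def weighted_grs(risk_alleles: Mapping[str, int],
                 betas: Mapping[str, float]) -> float:
    """Weighted GRS: each risk allele count multiplied by that SNP's
    beta-coefficient, then summed over SNPs (the summation is an assumption)."""
    return sum(betas[snp] * count for snp, count in risk_alleles.items())


# Hypothetical SNP names and beta-coefficients, for illustration only.
example_betas = {"snp_a": 0.30, "snp_b": 0.20, "snp_c": -0.10}

# Two hypothetical genotypes with different risk allele patterns.
person_1 = {"snp_a": 2, "snp_b": 1, "snp_c": 0}
person_2 = {"snp_a": 0, "snp_b": 0, "snp_c": 2}

counted_1 = counted_grs(person_1)
weighted_1 = weighted_grs(person_1, example_betas)
counted_2 = counted_grs(person_2)
weighted_2 = weighted_grs(person_2, example_betas)
```

In this toy example the second genotype carries two risk alleles, so its counted score is positive, yet its weighted score is negative because the only SNP involved has a negative beta-coefficient. This is the situation in which the two scores can rank individuals differently.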
Still, the counted GRS and weighted GRS yielded similar results. Cornelis et al. [31] and Ripatti et al. [32] used methods similar to the one used to create the GRS in our study. Several other studies have reported different ways of calculating risk scores for the prediction of diseases [39-41]. Horne et al. [39] introduced a regression method for calculating risk scores that incorporated 3 genetic polymorphisms and other risk factors and found that the frequency of coronary heart disease differed across regression score levels. Ortlepp et al. [40] concluded that multiple SNPs are better than single SNPs and that as many as 200 SNPs may be necessary for "reasonable" genetic discrimination. Aston et al. [41] suggested that a score based on 90 SNPs in 78 genes can predict the risk of breast cancer, but the identity of the SNPs and the algorithm for calculating the score remain proprietary. An alternative way to calculate the GRS using machine learning approaches, such as support vector machines (SVMs), could be introduced, as SVMs have already been applied to many biological problems, such as DNA expression profiles [42]. Still, further studies are needed before machine learning approaches, such as SVMs, can be used for the calculation of the GRS.

To our knowledge, no studies have evaluated a GRS using SNPs contributing to CRC for the prediction of the disease in the Korean population. The present study evaluated a prediction model using the counted GRS or weighted GRS together with conventional risk factors, such as age and family history of CRC, among Koreans. The risk of CRC is reported to increase in individuals with a family history of CRC, in particular those >50 years of age [43, 44]. In a recent study, a CRC prediction model was developed for middle-aged Japanese men using the known major risk factors of age, BMI, alcohol consumption, smoking status, and physical activity level [17].
Another recent study on a CRC prediction model included an individual's age, sex, history of CRC, sigmoidoscopy/colonoscopy, polyps, family history of CRC, smoking, physical activity, aspirin/NSAID use, vegetable intake, BMI, and hormone replacement in women [16]. In our study, the CRC prediction model comprised conventional risk factors, such as age and family history of CRC, together with the GRS. Inclusion of the counted GRS or weighted GRS improved CRC prediction beyond that provided by conventional risk factors, such as age and family history of CRC. For example, when the counted GRS was added to the CRC prediction model consisting of age and family history of CRC, the AUC increased by 4.2% in men and 5.2% in women, whereas the AUC increased by 3.2% in men and 4.8% in women when the weighted GRS was added to the same model.

Studies reporting significant associations of GRS with coronary heart disease, type 2 diabetes, and breast cancer have suggested that considering the contribution of multiple SNPs may improve the predictive value of GRS for such diseases [18, 31, 45]. In other words, combining multiple loci with modest effects into a global GRS might improve identification of persons who are at risk for such diseases [23-25]. For example, in the ARIC study, combining multiple SNPs into a single GRS improved the prediction of incident CHD [18]. In a study that used a counted GRS or weighted GRS to determine the risk for type 2 diabetes in US men and women, individuals in the highest quintile of GRS had a significantly increased risk of type 2 diabetes compared to those in the lowest quintile; however, the addition of the GRS increased the AUC by only 1%. In that study, the counted GRS was found to be useful when considered jointly with BMI or with family history of diabetes [30].
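The AUC comparisons discussed above rest on a simple idea: the AUC is the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control. The sketch below computes it directly from that definition for a model with and without a GRS term. All predicted risks are made-up toy values, not data from this or any cited study, and the function names are hypothetical.

```python
from typing import Sequence


def auc(case_scores: Sequence[float], control_scores: Sequence[float]) -> float:
    """AUC as the fraction of case-control pairs in which the case has the
    higher predicted risk; tied pairs count as half."""
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    concordant = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                concordant += 1.0
            elif case == control:
                concordant += 0.5
    return concordant / (len(case_scores) * len(control_scores))


# Toy predicted risks from a model with conventional risk factors only.
base_cases = [0.6, 0.4, 0.5]
base_controls = [0.5, 0.3, 0.4]

# Toy predicted risks after adding a GRS term to the same model.
grs_cases = [0.7, 0.5, 0.6]
grs_controls = [0.4, 0.3, 0.4]

auc_base = auc(base_cases, base_controls)
auc_with_grs = auc(grs_cases, grs_controls)
auc_gain = auc_with_grs - auc_base
```

The pairwise loop is quadratic in sample size, which is acceptable for illustration; for real data a rank-based implementation from an established statistics library would be the more usual choice.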
In our study, individuals in the highest quartile of GRS had an increased risk of CRC compared to those in the lowest quartile of GRS, in both men and women. In addition, in strata of family history of CRC and GRS, this increase was even higher in individuals with a family history of CRC in the highest quartile of GRS than in those without a family history of CRC in the highest quartile of GRS, in both men and women. However, the interactions were statistically significant in men but not in women. Other commonly used conventional risk factors, such as smoking and alcohol consumption, were not included in the CRC prediction model, as smoking, alcohol consumption, BMI, and WC did not significantly interact with the GRS. Therefore, further studies are needed to verify these results.

A family history of CRC is commonly used as a surrogate marker for determining genetic susceptibility to CRC and remains one of the strongest risk factors for the disease [10, 31, 46]. Approximately 25% of all CRC cases occur in individuals with a family history of the disease and no genetic disorders [47]. In addition, some retrospective studies have suggested that a history of CRC in a first-degree relative (a parent or sibling) elevates a person's lifetime risk of CRC by 1.8-fold to 8.0-fold [10, 47]. This family history risk factor may encompass both genetic and shared environmental components [31]. In our study, after controlling for age and GRS, the strong relationship between family history of CRC and risk of CRC persisted. These findings suggest that other risk loci remain to be discovered or that family history has a much larger shared environmental component than previously thought [31].

Our study was not without limitations. The cross-sectional design precluded the determination of causality, and a prevalent case bias may exist because more prevalent cases (n = 165) than incident cases (n = 22) of CRC were included.
Combining prevalent and incident cases could introduce survival bias. Still, the 5-year survival rate for CRC in Koreans was 71.3% in 2009, while that in Americans, Europeans, and Japanese was 65.0%, 56.2%, and 65.2%, respectively [43], which suggests that Koreans have higher survival rates for CRC than these other populations. Additionally, this study is also a case-cohort study. Blood samples for the prevalent cases were collected at baseline, and incident cases that arose during the follow-up period (treated as prevalent cases in this study) might have been missed, as no further blood samples were taken. Those prevalent cases at baseline might therefore have become incident cases or mortality cases during the follow-up period, and it is difficult to say whether this study was performed only among survivors. Another limitation was the self-reported family history of CRC, which precludes the definitive exclusion of misclassification. The statistical power of the current study might be too low, as genotyping was performed separately for men and women. In addition, performing multiple tests separately in men and women may increase error rates. Although CRC affects men and women equally, gender differences in CRC may exist. For example, men had a greater risk of colorectal polyps (OR, 1.52; 95% CI, 1.41 to 1.64) and tumors (OR, 1.43; 95% CI, 1.22 to 1.68) than women, whereas women had a greater number of purely right-sided polyps and tumors [48]. Therefore, detection of genetic effects separately in men and women may be needed. In addition, age differences between case and control participants may also increase error rates, as control participants may become CRC patients when they reach the age of the cases. The results of the current study also lack validation and replication. Therefore, it is hard to say whether there is a true association between GRS and CRC in Korean men and women.
However, bootstrapping and 10-fold cross validation were used to assess the internal and external validity of this study. Furthermore, although sigmoidoscopy/colonoscopy history was the strongest risk factor in the previous study, the current study did not include it as one of the conventional risk factors of CRC. The number of cases included in this study was also relatively small. This study also excluded cases with a CRC onset age ≥ 55 years in order to obtain early-onset CRC cases. Therefore, effect estimates for cases with a CRC onset age ≥ 55 years could not be obtained in this study. Finally, most SNPs found to be associated with CRC in the study population differed from those found in relation to CRC in other populations. This could be due to differences in ethnicity and in the ages of the case participants included in this study (cases < 55 years).

Nevertheless, this relatively large-scale study demonstrated the effectiveness of a CRC prediction model using a GRS consisting only of SNPs significantly associated with CRC, and evaluated the effects on the risk of CRC of conventional risk factors, such as family history of CRC, in combination with the GRS. Moreover, the present study included the Korean population, whereas previous studies involving CRC prediction models using conventional risk factors or the relationship of genetic risk factors to CRC were limited to white and Japanese populations [17, 20-22].

In conclusion, our findings suggest that the CRC prediction model in the Korean population showed improved prediction estimates when age, family history of CRC, and the GRS were included. Furthermore, compared to those in the lowest quartile of GRS in the presence or absence of a family history of CRC, the risk of CRC was increased in individuals with a family history of CRC in the highest quartile of GRS. However, this increase was statistically significant in men but not in women.
The findings of this study might provide a small piece of evidence for the prediction of CRC, with the aim of reducing its prevalence and incidence rates. The prediction model developed in this study needs to be validated or replicated in an independent population. Therefore, further studies are needed before the model can be applied to the general population.