PMC:4996413 / 17184-19286
Annotations
{"target":"https://pubannotation.org/docs/sourcedb/PMC/sourceid/4996413","sourcedb":"PMC","sourceid":"4996413","source_url":"https://www.ncbi.nlm.nih.gov/pmc/4996413","text":"2.3. Statistical Pre-Selection of CpG Sites\nTwo statistical components have been used to pre-select a wide subset of candidate biomarkers prior to applying the data mining framework presented next. Statistics were applied for two separate experiments: controls vs. BCCA or controls [39] vs. LYCA. The goal was to pre-select a subset of CpG sites, in order to further feed the data-mining framework. These actually correspond to differentially-methylated CpG sites, with a statistical significance and beyond technical variation, as extracted using the statistical components for each of the separate experiments. These components correspond to (i) a scaled coefficient variation (Scaled CV) measurement and (ii) p-value measurements extracted by t-test and corrected by bootstrap. Scaled CV represents a robust measure of the real inter-class variability observed for a probe in the whole sample pool (controlsUcases), when compared to that observed among quality control samples, which measures solely the technical variation. The greater scaled CV is, the greater the real differential methylation is (beyond technical signal variation) and more reliable is the CpG site, as a candidate differentially methylated CpG site between the two sample categories (controls vs. cases).\nThe bootstrap corrected p-value measurements originate from a typical paired t-test for extracting statistically significant differentially-methylated probes (controls vs. BCCA or controls vs. LYCA). A paired t-test was possible since cases have been matched with controls for each of the experiment in terms of certain characteristics, e.g., age, sex, body mass index, and pre-post menopause for the breast cancer samples and their controls. 
The classical statistical test is followed, however, by a bootstrap p-value correction that immunizes statistical findings, against the detrimental effect of multiple hypothesis bias. The idea beneath this p-value correction is to examine whether the p-values obtained from the statistical test are indeed that extreme, or they could represent random false selections (see [34]).","divisions":[{"label":"Title","span":{"begin":0,"end":43}}],"tracks":[{"project":"2_test","denotations":[{"id":"27600245-24808224-69478311","span":{"begin":2097,"end":2099},"obj":"24808224"}],"attributes":[{"subj":"27600245-24808224-69478311","pred":"source","obj":"2_test"}]}],"config":{"attribute types":[{"pred":"source","value type":"selection","values":[{"id":"2_test","color":"#b8ec93","default":true}]}]}}