PMC:3543920 / 10764-12881

{"target":"https://pubannotation.org/docs/sourcedb/PMC/sourceid/3543920","sourcedb":"PMC","sourceid":"3543920","source_url":"https://www.ncbi.nlm.nih.gov/pmc/3543920","text":"False Positive and False Negative Calls\nRecent improvements of the WES technique in both experimental and bioinformatics pipelines have reduced the occurrence of false positive (false variant that is called true) and false negative (true variant that is failed to be called) variant calls substantially. Currently, we can achieve ~98% sensitivity and 99.8% specificity from exome sequencing data, minimizing the chance of missing any true variants as compared to the analysis of SNP array data (data not shown). However, as mentioned previously, it is technically challenging to detect and evaluate rare heterozygous variants because heterozygous variant calling is more susceptible to the technical errors and require higher read depth. For example, at a coverage depth of 4× (i.e., if a base is covered by 4 independent reads) and assuming a 1% per-base error rate, a homozygous variant will be called if 0 and 4 reads are observed for the reference and nonreference alleles respectively, with a false positive rate of 2 × 10-4. However, if both reference and nonreference alleles have 2× coverage, a heterozygous variant will be called with a false positive rate of 0.34. This presents a particular problem when one considers the uneven coverage in WES resulting from differential capture efficiencies across the exome. To achieve reliable heterozygous calling, it is therefore necessary to achieve sufficient coverage depth across the entire exome so as to minimize the number of bases with low coverage.\nAnother challenge to analysis is read misalignments; the typical length of NGS reads is near or less than 100 bp, and even paired reads are subject to being improperly aligned, because there are large portions of the human genome where the DNA sequence is highly repetitive and duplicated. 
In fact, segmental duplicated regions, defined as intervals larger than 1 kb having a homology \u003e90% with other parts of the genome, encompass about 5% of the human genome [15]. Hence, imposing more rigorous filtering criteria to remove such reads, including Phred score and mapping quality score cutoffs, is essential.","divisions":[{"label":"Title","span":{"begin":0,"end":39}}],"tracks":[{"project":"2_test","denotations":[{"id":"23346032-12169732-44837252","span":{"begin":1971,"end":1973},"obj":"12169732"}],"attributes":[{"subj":"23346032-12169732-44837252","pred":"source","obj":"2_test"}]}],"config":{"attribute types":[{"pred":"source","value type":"selection","values":[{"id":"2_test","color":"#e4ec93","default":true}]}]}}
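
The record above links each denotation to the text through character offsets in "span". A minimal sketch of reading those offsets follows, assuming that "begin" is inclusive and "end" is exclusive (consistent with the Title division, 0 to 39, covering the 39-character heading) and that offsets count Unicode code points as Python strings do. The helper name denotation_spans and the toy record are hypothetical; the toy record copies only the shape of the record above, not its text or ids.

```python
import json


def denotation_spans(doc):
    """Yield (id, covered_text, obj) for each denotation in every track.

    Assumes begin-inclusive, end-exclusive character offsets into doc["text"].
    """
    text = doc["text"]
    for track in doc.get("tracks", []):
        for d in track.get("denotations", []):
            begin = d["span"]["begin"]
            end = d["span"]["end"]
            yield d["id"], text[begin:end], d["obj"]


# Toy record with the same key layout as the record above (hypothetical id "T1").
raw = (
    '{"text": "about 5% of the human genome [15].", '
    '"tracks": [{"project": "2_test", "denotations": '
    '[{"id": "T1", "span": {"begin": 30, "end": 32}, "obj": "12169732"}]}]}'
)

doc = json.loads(raw)
rows = list(denotation_spans(doc))
for row in rows:
    print(row)
```

In the toy record the span 30 to 32 covers the citation number inside the brackets, which mirrors how the single denotation above points at the reference marker for PubMed id 12169732.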
