PMC:3475479 / 10584-13427 JSONTXT

Annnotations TAB JSON ListView MergeView

    2_test

    {"project":"2_test","denotations":[{"id":"23105922-22081229-44845644","span":{"begin":688,"end":690},"obj":"22081229"},{"id":"23105922-22080106-44845645","span":{"begin":757,"end":759},"obj":"22080106"},{"id":"23105922-22026465-44845646","span":{"begin":824,"end":826},"obj":"22026465"},{"id":"23105922-21921910-44845647","span":{"begin":898,"end":900},"obj":"21921910"},{"id":"23105922-21878562-44845648","span":{"begin":923,"end":925},"obj":"21878562"},{"id":"23105922-21876176-44845649","span":{"begin":993,"end":995},"obj":"21876176"},{"id":"23105922-18714091-44845650","span":{"begin":1402,"end":1404},"obj":"18714091"},{"id":"23105922-21917492-44845651","span":{"begin":2546,"end":2548},"obj":"21917492"}],"text":"SNP discovery and genotyping with resequencing\nResequencing of genomic regions or target genes of interest in a phenotype is the first step in the detection of DNA variations associated with the gene regulation. The discovery of single-nucleotide polymorphisms (SNPs) including insertion/deletions (indels), with high-throughput data is useful to study genetic variation, comparative genomics, linkage map, and genomic selection for breeding value with DNA variation. Many geneticists for biological and genome studies of microbial, plant, animal, and human genomes have effectively used NGS whole-genome resequencing data to use in variable research fields, such as bacterial evolution [19], genomewide analysis of mutagenesis of Escherichia coli strains [20], comparative genomics of Streptococcus suis of swine pathogen [21], genomic variation effects on phenotype and gene regulation in mouse [22], evolution of plant [23], and comparison of genetic variations on the targeted enrichment [24]. The platforms of resequencing projects have used Illumina/Solexa of short read lengths to align with the reference sequence to discover DNA variations between compared related species' sequences. Because of rare occurrence of SNPs in most species, it is important to identify high-accuracy data to discover DNA variations according to coverage depth using MAQ (http://maq.sourceforge.net/maq-man.shtml) [25] and CLC software (http://www.clcbio.com). The public protocol of covering depth to discover SNPs and indels on the heterogeneous genome requires at least 30× of the reference genome, while about 10× depth of coverage is enough for DNA variation study of homogeneous genomes. Of course, high coverage of depth provides high-quality data in SNP detection on the reference mapping (Fig. 3). However, short read lengths of 35 bp or 100 bp show enough to map on the reference sequence using the MAQ software and CLC software in the genome, including short repeated block regions. But, geneticists still require long-read sequencing data to distinguish repeated block regions, like paralogous regions derived from gene duplication. MAQ software provides a consensus sequence of the genotype sequenced of short read lengths with aligned raw reads to the reference sequence. CLC software checks accuracy by counting reads of DNA variations of each position. Recently, a novel application of pattern recognition for accurate DNA variations was discovered in the complexity of the genomic region using high-throughput data in a Caucasian population [26]. They used three independent datasets with Sanger sequencing and Affymetrix and Illumina microarrays to validate SNPs and indels of a clinical target region, FKBP5. Therefore, it is necessary for multiplatform systems to validate DNA variations in the specific complexity of the genome region."}