PMC:3475479 / 6210-10582

    2_test

    {"project":"2_test","denotations":[{"id":"23105922-22260654-44845636","span":{"begin":779,"end":780},"obj":"22260654"},{"id":"23105922-22053731-44845636","span":{"begin":779,"end":780},"obj":"22053731"},{"id":"23105922-21075933-44845636","span":{"begin":779,"end":780},"obj":"21075933"},{"id":"23105922-21478339-44845636","span":{"begin":779,"end":780},"obj":"21478339"},{"id":"23105922-21994929-44845636","span":{"begin":779,"end":780},"obj":"21994929"},{"id":"23105922-22192914-44845637","span":{"begin":806,"end":807},"obj":"22192914"},{"id":"23105922-21183663-44845638","span":{"begin":1336,"end":1338},"obj":"21183663"},{"id":"23105922-20010809-44845639","span":{"begin":1687,"end":1689},"obj":"20010809"},{"id":"23105922-21186353-44845640","span":{"begin":1862,"end":1864},"obj":"21186353"},{"id":"23105922-22192763-44845641","span":{"begin":3057,"end":3058},"obj":"22192763"},{"id":"23105922-20386741-44845642","span":{"begin":3418,"end":3420},"obj":"20386741"},{"id":"23105922-21149342-44845643","span":{"begin":3976,"end":3978},"obj":"21149342"}],"text":"Novel whole genome de novo assembly\nMore than 11,000 sequencing projects, including targeted projects, were reported on the Genome Online Database (GOLD, http://www.genomesonline.org) in early 2012. Now, more than 3,000 genome projects have been completed on the diverse genome species, and more than 90% of completed projects were bacterial genome sequencing. The greatest bacterial genome sequencing was performed with 454 pyrosequencing because of the available largest long read sequencing, useful for de novo assembly of novel genome sequencing. The official depth of the deep sequencing strategy of 454 pyrosequencing technology for whole bacterial genome sequencing for de novo assembly in novel genome sequencing is at least 15-20× in depth of the estimated genome size [9-13]. However, Li et al. 
[3] reported that 6-10× sequencing in qualified runs with 500-bp reads would be enough for de novo assemblies from 1,480 prokaryote genomes with \u003e98% genome coverage, \u003c100 contigs with N50, and size \u003e100 kb. Recently, prokaryote whole genome sequencing using 101 bp paired-end read data from Illumina/Solexa systems was used for de novo assembly and resequencing. For example, a Bacillus subtilis subspecies genome sequence was generated by using the short read sequence from Illumina/Solexa and assembled with the Velvet program [14]. In this case, the genome assembly was completed, based on the reference genome for ordering the numerous contigs derived from de novo assembly. Even though numerous contigs assembled with Illumina/Solexa data were produced in the eukaryotic genome, a few drafts for the assembled genome sequence were reported, except for the giant panda genome [15], which was covered with assembled contigs (2.25 Gb), covering approximately 94% of the expected whole genome. Another example was the woodland strawberry genome (240 Mb) [16] that was sequenced to 39× depth of the genome, assembled de novo, and anchored to the linkage map of seven pseudochromosomes.\nThe genome sequence could be associated with the predicted genes with transcriptome sequence data. An ideal method for cost-effective novel genome sequencing using NGS is de novo assembly with diverse shotgun fragment end sequencing data of multiplat systems (Fig. 1). The first strategy of novel genome DNA sequencing is sequencing the genomic DNA for contig and scaffold construction after randomly sheared shotgun single read-end or paired-end read DNA sequencing using Roche/454 or Illumina/Solexa with information on how to assemble with the NGS data using variable assembly software. 
Recently, a catfish genome was sequenced with multiplatform Roche/454 and Illumina/Solexa technology and assembled with an effective combination of low coverage depth of 18× Roche/454 and 70× Illumina/Solexa data using 3 assembly softwares - Newbler software to the 454 reads, Velvet assembler to the Illumina read, and MIRA assembler for final assembly of contigs and singletons derived from initial assembled data - resulting in 193 contigs with an N50 value of 13,123 bp [2]. In an additional multiplatform data assembly of a 40-Mb eukaryotic genome of the fungus Sordaria macrospra, a combination sequence of 85-fold coverage of Illumina/Solexa and 10-fold coverage by Roche/454 sequencing was assembled to a 40-Mb draft version (N50 of 117 kb) with the Velvet assembler as a reference of a model organism for fungal morphogenesis [17]. In the recent effective assembly methods reported, combinations of the multiplatform sequence are shown as successful novel genome assembly using variable assembly strategy pipelines. Comparing the pipeline of assembly strategy, we suggest an effective integrated pipeline in which data are filtered to remove low-quality and short-read initial assemblies using variable software and then compared to contigs, hybrid contigs using MIRA assembler, and finally contig orders using SSPACE software (http://www.baseclear.com/dna-sequencing/data-analysis/) [18] for scaffold construction through de novo assembly of novel genome sequencing (Fig. 2). According to the comparison of several ways of de novo assembly, we suggest using both DNA sequences from multiplatform NGS with at least 2× and 30× depth sequences of genome coverage using Roche/454 and Illumina/Solexa, respectively, and doing hybrid assembly for cost-effective novel genome sequencing."}
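The passage in the `text` field above reports assemblies by their N50 values and recommends sequencing to a given fold coverage (for example, at least 30× with Illumina/Solexa). As a minimal sketch of how those two quantities are commonly computed, the Python below uses the standard definitions of N50 and of coverage depth (total sequenced bases divided by genome size). The function names, the toy contig lengths, and the 4-Mb genome size are illustrative assumptions and do not come from the article.

```python
import math


def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length L of the contig at which, going from the longest
    contig downwards, the running total first reaches at least half of
    the summed length of all contigs.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def coverage_depth(n_reads, read_length_bp, genome_size_bp):
    """Average fold coverage: sequenced bases divided by genome size."""
    return n_reads * read_length_bp / genome_size_bp


def reads_for_depth(target_depth, read_length_bp, genome_size_bp):
    """Smallest number of reads that reaches the target fold coverage."""
    return math.ceil(target_depth * genome_size_bp / read_length_bp)


if __name__ == "__main__":
    # Toy contig lengths (arbitrary units), chosen only for illustration.
    print(n50([80, 70, 50, 40, 30, 20]))
    # Hypothetical 4-Mb bacterial genome, 101-bp reads, 30x target depth.
    print(reads_for_depth(30, 101, 4_000_000))
```

The read count is rounded up, so the achieved depth is never below the target. Real projects would also need to allow for reads lost to quality filtering, which this sketch ignores.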