PMC:6440665 / 8564-10982
Results

First, total read counts were obtained with either Picard or fgbio tools. The 8-plexed hybrid capture library generated 100 Gbp of data in total, or approximately 15 Gbp per sample. Given a target size of 0.3 Mbp, this corresponds to read coverages of 50,000× to 70,000×, as shown in Table 1. Because the on-target rate (%) represents how specifically the probes hybridize to the target genomic regions, it serves as a measure of experimental efficiency, and we therefore examined it. We obtained an average on-target rate of 37% (Table 1, Fig. 2). On-target rates increased slightly when the target regions were expanded to include flanking regions (+100 bp). In addition, each sample showed a similar on-target rate, suggesting that the hybridization step of targeted sequencing was performed with an even amount of each library.

Before removing sequence duplicates, we estimated duplicate read counts with UmiAwareMarkDuplicatesWithMateCigar (Picard). More than 70% of the reads were duplicates (Fig. 3). Considering that percent duplication increases as sequencing coverage rises [16], an average duplicate rate of 73% indicates that the targeting efficiency of the probes was high in this high-coverage sequencing. Furthermore, because an excessive number of PCR cycles during target enrichment introduces severe biases [17], we inferred that the number of PCR cycles used during enrichment was moderate. Although duplicate reads were not discarded in subsequent steps, checking the duplicate rate confirmed that the number of PCR cycles during library preparation and target enrichment was adequate. It also confirmed that the efficiency of our experiments can be assessed through data analysis.

We then generated consensus reads in preparation for more accurate variant calling. This filtering reduced the consensus coverage to about one-sixth of the raw coverage (Fig. 4). Following the filtering method, a consensus sequence was called from the reads of each UMI family and used for variant scanning. The tool's algorithm also accounts for the probability of errors when calling and filtering consensus reads. Finally, reads containing Ns and reads with low confidence were filtered out for highly confident variant calling.
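The coverage and on-target figures above follow from simple ratios; a minimal sketch of the arithmetic (function and variable names are illustrative, not taken from the paper's pipeline):

```python
def mean_coverage(total_bases, target_size_bp):
    """Naive mean raw coverage: total sequenced bases divided by target size."""
    return total_bases / target_size_bp

def on_target_rate(on_target_bases, total_bases):
    """Fraction of sequenced bases that fall within the target regions."""
    return on_target_bases / total_bases

# Figures from the text: ~15 Gbp per sample over a 0.3 Mbp target.
print(mean_coverage(15e9, 0.3e6))  # 50000.0, the lower bound quoted above
```

With ~15 Gbp per sample and a 0.3 Mbp target, the ratio reproduces the quoted 50,000× lower bound; the higher 70,000× figure corresponds to samples with more sequenced bases.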
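Conceptually, consensus calling collapses each UMI family into a single read by per-position majority vote, masking low-confidence positions as N. The toy sketch below illustrates that idea only; the actual fgbio tools operate on BAM records and model base-quality error probabilities, and all names here are illustrative:

```python
from collections import defaultdict

def call_consensus(reads, min_family_size=1):
    """Group (umi, position, sequence) tuples into UMI families and emit one
    consensus sequence per family. Consensus base = strict majority at each
    position; ties are masked as 'N'. Families smaller than min_family_size
    are dropped, mirroring the filtering of low-confidence reads."""
    families = defaultdict(list)
    for umi, pos, seq in reads:
        families[(umi, pos)].append(seq)

    consensus = {}
    for key, seqs in families.items():
        if len(seqs) < min_family_size:
            continue  # too few supporting reads for a confident consensus
        bases = []
        for column in zip(*seqs):  # walk the family position by position
            counts = {}
            for b in column:
                counts[b] = counts.get(b, 0) + 1
            best = max(counts, key=counts.get)
            # mask as 'N' unless the top base is a strict majority
            bases.append(best if counts[best] * 2 > len(column) else "N")
        consensus[key] = "".join(bases)
    return consensus
```

For example, three reads from one UMI family reading ACGT, ACGA, ACGT collapse to the single consensus ACGT, which is why consensus coverage drops well below raw coverage.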