Hi friends,
Recently, we performed exome sequencing for 3 samples using a Nextera sequencing machine. We used a new kit for exome sequencing, so I am interested in finding out the on-target and off-target percentages of reads from my exome sequencing run.
This is my understanding. On-target: reads that align to the targeted regions (the exome regions as per the manifest file). Off-target: reads that align to regions that are not targeted.
What percentage of reads should cover the on-target regions? How can we calculate the on-target reads from a BAM file? Which tool is useful for getting the percentage of reads covering on-target and off-target regions?
Calculating Exome Coverage
Picard: https://broadinstitute.github.io/picard/picard-metric-definitions.html#HsMetrics
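Once the metrics file is written, it is plain tab-separated text, so the numbers can be pulled out with a few lines of Python. This is only a rough sketch, assuming the usual Picard metrics layout (comment lines starting with `#`, then one header row followed by the value rows). The function names and the sample text are made up for illustration, and the column names are taken from the HsMetrics definitions linked above, so check them against your own file.

```python
def parse_picard_metrics(text):
    """Return the first metrics row of a Picard-style metrics file as a dict.

    Assumes: lines starting with '#' are comments, the first remaining
    non-blank line is the tab-separated header, and the next one holds values.
    """
    header = None
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if header is None:
            header = fields  # first data-like line: column names
        else:
            return dict(zip(header, fields))  # second one: the values
    return {}


def ratio(metrics, numerator, denominator):
    """Divide one metrics column by another (both parsed as numbers)."""
    return float(metrics[numerator]) / float(metrics[denominator])


# Made-up numbers, only to show the shape of the file.
example = (
    "## METRICS CLASS\n"
    "PF_BASES_ALIGNED\tON_BAIT_BASES\tOFF_BAIT_BASES\n"
    "1000\t700\t200\n"
)
metrics = parse_picard_metrics(example)
print(ratio(metrics, "OFF_BAIT_BASES", "PF_BASES_ALIGNED"))
```

Which columns make a sensible numerator and denominator depends on how your Picard version defines them, so it is worth reading the definitions page before trusting any derived percentage.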
Thanks, genomax2. I am currently working on the output files from picard.
Hi Genomax2,
This is the output I got from Picard HsMetrics. I am getting the on-target bases and on-bait bases (baits being the probes used for exon capture). I used two kinds of BAM file: a) a BAM that was only sorted and b) a sorted, deduplicated, recalibrated BAM. I believe that I should stick with the clean.dedup.recal.bam statistics.
1) What is the vendor's filter?
2) I can see off_bait_bases, but not off-target bases.
3) Will I be able to get the percentage of reads that are on-target and off-target?
The reason I want the on-target reads is this:
A base within a read is considered on target if it is aligned with a targeted region. A read is considered on target if a single base within a read aligns to a targeted region. Measuring reads on target might be more accurate in representing the target fragments.
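To make the difference between the two definitions concrete, here is a small sketch in plain Python. It is not how Picard or any other tool does it. It assumes reads and targets are already available as `(chrom, start, end)` tuples with 0-based, half-open coordinates, and that the target intervals do not overlap each other (otherwise bases would be counted twice). The function names and the toy coordinates are made up for illustration; with real data the coordinates would come from the BAM and the target BED or manifest file.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Number of bases shared by two half-open intervals."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))


def on_target_stats(reads, targets):
    """Compare read-level and base-level on-target fractions.

    reads, targets: lists of (chrom, start, end), 0-based half-open.
    Assumes the target intervals are non-overlapping.
    """
    on_reads = 0
    on_bases = 0
    total_bases = 0
    for chrom, start, end in reads:
        total_bases += end - start
        # Bases of this read that fall inside any target on the same chromosome.
        covered = sum(
            overlap(start, end, t_start, t_end)
            for t_chrom, t_start, t_end in targets
            if t_chrom == chrom
        )
        on_bases += covered
        if covered > 0:
            on_reads += 1  # a single overlapping base makes the read on-target
    return {
        "read_fraction": on_reads / len(reads) if reads else 0.0,
        "base_fraction": on_bases / total_bases if total_bases else 0.0,
    }


# Toy data: one 100 bp target and four 100 bp reads.
targets = [("chr1", 100, 200)]
reads = [
    ("chr1", 50, 150),   # 50 bases inside the target
    ("chr1", 190, 290),  # 10 bases inside the target
    ("chr1", 300, 400),  # no overlap
    ("chr2", 100, 200),  # same coordinates, different chromosome
]
print(on_target_stats(reads, targets))
```

In this toy case two of the four reads count as on-target, but only 60 of the 400 aligned bases do, which is why the read-level and base-level percentages can differ a lot for the same sample.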
Dear friend, I have a sample with a read count of 51,578,482. After duplicate removal, I have about 46,393,168 reads. The mapped read count is 46,676,207 (99.36%) and the on-target read count is 38,221,722 (81.36%). Do you think this is a good result? What is the ideal on-target rate, as a percentage? I used Agilent V6+UTR, 150 PE.
Please share your valuable input!