Entering edit mode
8.6 years ago
prasoon.agarwal
▴
10
I have bam files and want to extract the reads that overlap the region provided e.g. I used the following:
samtools view -f 0x02 file_sorted.bam chr3:186749777-186749777 > chr3_reads.txt
is this the most accurate way or some other more specific method is available. I have a mutation in the above location and want exactly all reads that overlap this location
Thanks
You also need
-F 12
because unmapped reads can be in a proper pair.Other than that, this is not a question of accuracy. For operations like this, samtools and all other tool will be 100% accurate. The question is do you want to include reads which over lap the region boundry? What about pairs where 1 read is in the region and 1 is outside so the whole fragment overlaps the boundry? I cant remember what samtools does by default, but it doesn't matter because you have to be the one to decide first how you want it done, then find the appropriate tool :)
There are many ways to do it with a little bit of tweaking
https://github.com/genome/bam-readcount
https://github.com/alimanfoo/pysamstats
https://www.broadinstitute.org/gatk/guide/tooldocs/org_broadinstitute_gatk_tools_walkers_rnaseq_ASEReadCounter.php
https://github.com/pysam-developers/pysam ( generate a pileup and count ref/alt bases )