Hi!
I have been looking at the post "How To Extract Reads With Wrong Insert Size With Samtools", and I wonder how I could extract from a BAM file just those reads with an insert size of 0.
I have been trying:
samtools stats -i 0 sample.bam
Thanks!
To see what types of characteristics are available via the FLAG, check out the Broad Institute's Explain Flag Site. As you can see, the insert size is not part of that list. However, depending on the alignment software you used, the cases you've described may be captured by the flag for "proper pair".
The fragment size is usually recorded in the 9th column (TLEN), if I remember correctly. (I just double-checked and found this post, which goes into more detail; definitely read through it and the SAM documentation linked there to get a better understanding of the SAM format.)
samtools view myfile.bam | awk '$9 == 0 {print $0}'
might do the trick then (have not tested this and it's probably not very fast either). It seems like bamtools' filtering tool might offer the same functionality.
EDIT: Istvan offered a more elegant solution in this post
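A minimal sketch of the TLEN filter above that also keeps the header lines, so the result can be written back to BAM. The function name filter_tlen_zero and the file names are hypothetical, and this is untested on real data. Note that, per the SAM specification, TLEN is also 0 for unpaired reads and when the template length is unavailable (for example, mates on different chromosomes), so the output may contain more than same-position pairs.

```shell
# Hypothetical helper: pass SAM header lines (starting with "@") through
# and keep only alignment records whose 9th column (TLEN) equals 0.
filter_tlen_zero() {
    awk -F'\t' '/^@/ || $9 == 0'
}

# Assumed usage (requires samtools; file names are placeholders):
#   samtools view -h sample.bam | filter_tlen_zero | samtools view -b -o tlen0.bam -
```

The -h option makes samtools view emit the header, which the second samtools view call needs in order to write a valid BAM file.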
The command to extract records from a SAM/BAM file is samtools view (as shown in that post); I'm not sure where you're getting samtools stats from.
But how can I get it? I have seen no flags for this kind of read.
What does it mean when you say insert size of 0?
I would like to obtain the paired reads that, for some strange reason, are mapped to the same position and so have no insert size.
The only scenario where I suppose that can happen is if the size of the insert = the number of sequencing cycles. So you would need to find paired reads that have an identical mapping start on the same chromosome. Otherwise it would be unlikely.
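One way to look for such pairs, as a rough sketch: compare each record's own start (POS, column 4) with its mate's start (PNEXT, column 8) and require the mate to be on the same reference (RNEXT, column 7, is "=" or equals RNAME). The function name same_start_pairs and the file names are hypothetical, and I have not tested this on real data.

```shell
# Hypothetical helper: keep header lines plus mapped records whose mate
# starts at the same position on the same reference sequence.
same_start_pairs() {
    awk -F'\t' '/^@/ || ($3 != "*" && ($7 == "=" || $7 == $3) && $4 == $8)'
}

# Assumed usage (requires samtools; file names are placeholders).
# -f 1 keeps paired reads; -F 12 drops records where the read or its mate is unmapped.
#   samtools view -h -f 1 -F 12 sample.bam | same_start_pairs | samtools view -b -o same_start.bam -
```

Doing the flag filtering in samtools keeps the awk part portable, since bitwise operations on the FLAG column are not available in every awk.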