I DO NOT want to filter on "properly paired" reads within a reasonable insert size.
I DO want to filter on paired reads no matter the insert size.
That's reads like this
--> <--
with no insert size restriction using BAM files
Thanks :)
I DO NOT want to filter on "properly paired" reads within a reasonable insert size.
I DO want to filter on paired reads no matter the insert size.
That's reads like this
--> <--
with no insert size restriction using BAM files
Thanks :)
There should be a straightforward solution for it but right now all I can think of is to use appropriate SAM flags ( second column of the SAM format) to extract such reads. Below are the flags for the reads you are interested in. You will also have to make sure that both reads belong to the same chromosome as shown in the awk one liner.
81 read paired, first in pair, read reverse strand, mate forward strand
97 read paired, first in pair, read forward strand, mate reverse strand
145 read paired, second in pair, read reverse strand, mate forward strand
161 read paired, second in pair, read forward strand, mate reverse strand
samtools view -f 81 -f 97 -f 145 -f 161 Input.bam | awk '$7=="\="{print $0}' > Input.sam
Add header and convert it to bam.
Modified later: I just realized that using samtools view -f 145 gives you reads flagged as 145, 147 and 177. You may be fine with 147 but not with 177. I think you may have to use awk to extract reads flagged as 145 and 147.
Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
@Ashutosh Pandey: I'm only staring to learn SAM/BAM flags. My understanding of flag 81 is this:
1+16+64 right? from SAM specs:
1 -> template having multiple segments in sequencing
16 -> SEQ being reverse complemented
64 -> the first segment in the template
How do you know that "mate forward stranded"?
Thanks
Because the flags including "segment unmapped" and "SEQ being reverse complemented" are not set true for the mate, I guess you could assume that means that the mate was mapped on to the forward strand. If the mate maps to the reverse strand then you should see value of 113 instead of 81. For the same reason a single end read that is mapped has a 0 flag.
I see, thanks for the answer