Hi,
I want to extract only those reads from a BAM file in a given region where both mates are properly paired and neither mate is split.
Currently I am doing:
samtools view -h -f 2 -F 4 input.bam "chr:start-stop" | awk '$6 !~ /S/ || $1 ~ /@/' | samtools view -bS - > output.proper.nosplit.bam
Here, awk '$6 !~ /S/ || $1 ~ /@/' removes the split reads (records with a soft clip, S, in the CIGAR string in column 6) while keeping the header lines.
However, I also want the properly paired mate of each split read to be removed.
I tried running samtools view -h -f 2 -F 4 output.proper.nosplit.bam > output2.sam, but it still does not remove the mate of the split read.
Any ideas?
Thanks in advance
Can we use the 2048 flag (and/or 256) to filter out split reads from the region of interest?
It depends on which flag your aligner sets on split reads. To remove just the secondary/supplementary alignment records, I would filter on the relevant SAM flag; to remove all records that belong to split reads, I would check for the presence of the SA tag.
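On the mate question: the proper-pair bit (0x2) is stored on each record when the aligner writes it, so a second pass with -f 2 cannot tell that a read's mate has since been filtered out. One possible workaround is to collect the names of the split reads first and then drop every record with one of those names. Below is a rough sketch of that idea, not a tested pipeline. The function name drop_split_pairs is made up for illustration, and it assumes a read counts as "split" if its CIGAR contains S (mirroring the awk in the question) or if it carries an SA tag.

```shell
# Hypothetical helper: drop_split_pairs FILE.sam
# Reads the SAM file twice and prints header lines plus only those records
# whose read name (column 1) never appears on a split record.
# Assumption: "split" means the CIGAR (column 6) contains S, or an optional
# field (column 12 onwards) is an SA:Z: tag.
drop_split_pairs() {
    awk -F'\t' '
        NR == FNR {                       # first pass: collect split read names
            if ($0 ~ /^@/) next           # skip header lines
            if ($6 ~ /S/) { bad[$1] = 1; next }
            for (i = 12; i <= NF; i++)
                if ($i ~ /^SA:Z:/) { bad[$1] = 1; break }
            next
        }
        /^@/ || !($1 in bad)              # second pass: keep header and clean names
    ' "$1" "$1"
}

# Possible usage (needs samtools; file names follow the question):
#   samtools view -h -f 2 -F 4 input.bam "chr:start-stop" > region.sam
#   drop_split_pairs region.sam | samtools view -b -o output.proper.nosplit.bam -
# To also exclude secondary and supplementary records by flag,
# -F 2308 (4 + 256 + 2048) could replace -F 4 in the first command.
```

Because the file is read twice, the input has to be a regular file rather than a pipe. Also note that only records inside the requested region are seen, so a mate that maps outside the region cannot veto its partner.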
Thanks, @d-cameron