Hi all,
I have a BAM file of an alignment, and I need to remove the reads that are fully contained within certain intervals, provided as a GFF file (could be converted to BED).
I know bedtools intersect
has the -v
option and supports BAM files; however, running the command like this produces the result that is clearly wrong (I just looked at it in the IGV):
bedtools intersect -v -f 1.0 -a old.bam -b int.bed > new.bam
Can someone explain how does -f
option work here, and what would be a good way to achieve what I'm trying to do?
I can, of course, just do a direct overlap with -wao
style, BED-formatted output, then get the list of read names which overlap the intervals by exactly a read length, and then remove the reads from the BAM file by their name list. But this does seem cumbersome.
Thank you, as always.
On paper, the command you used looks correct to me.
-v -f 1
means that you select againstold.bam
reads fully mapped onint.bed
intervals.What exactly is wrong in IGV ?
Basically wrong reads are removed. You can see which ones overlapped the intervals, and that's not what was removed at all.
Does not answer your question but
samtools view
has--output-unselected
option that might be of use http://www.htslib.org/doc/samtools-view.htmlMaybe you can try pybedtools.