Dear Members,
Is there a way I can remove reads associated with a region (chr, start, end) from a .bam file (RNASeq data) prior to the application of HTSeq?
I will greatly appreciate your feedback
Noushin
bedtools intersect -abam file.bam -b filter.bed -v > filtered.bam
filter.bed should contain the regions to remove, one per line, tab-separated:
chr start end
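As a small illustration, a one-region filter.bed could be written like this. The chromosome name and coordinates are made up for the example; note that BED starts are 0-based and ends are exclusive, so the line below covers bases 1001 to 2000 in 1-based terms.

```shell
# Hypothetical region: chr1, BED start 1000, BED end 2000 (tab-separated).
printf 'chr1\t1000\t2000\n' > filter.bed
```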
Just found, there is an option -U in samtools view. Use it like this:
samtools view input.bam -b -h -o output_inRegions.bam -U output_outRegions.bam -L Regions.bed
You'll want to use NGSUtils' bamutils tool, specifically with -excludebed.
But, I'd recommend you don't :P
The BAM format is meant to store highly compressed alignment data. You should treat BAM files like raw, virgin data, without normalization/filtering tweaks here and there to get them into shape.
All that kind of intersection stuff should be done on processed signal data - wigs and bedgraphs, etc - where it's much easier to have multiple versions of things and to just dump it all and start afresh from the .bam if you have to.
Having said that, it's your data, do what you like with it :)
If you are filtering your BAM for HTSeq, then you are doing extra work. You should just modify the GTF file that you are giving to HTSeq to exclude regions you do not want.
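A rough sketch of how that GTF filtering might look with plain awk, assuming the unwanted regions are listed in a tab-separated BED file. The function name filter_gtf and the file names are made up for this example. It drops every GTF line that overlaps a region, so a gene that only partly overlaps loses just the overlapping features; whether that is what you want depends on your annotation. It also assumes the BED file is not empty.

```shell
# Hypothetical helper: drop GTF features that overlap any region in a BED file.
# Usage: filter_gtf regions.bed annotation.gtf > annotation.filtered.gtf
filter_gtf() {
    awk -F'\t' '
        # First file (BED): store each region, converting the 0-based,
        # end-exclusive BED interval to 1-based inclusive coordinates.
        NR == FNR { n++; chr[n] = $1; s[n] = $2 + 1; e[n] = $3; next }
        # Second file (GTF): pass comment lines through unchanged.
        /^#/ { print; next }
        {
            # GTF columns 4 and 5 are the 1-based, inclusive start and end.
            for (i = 1; i <= n; i++)
                if ($1 == chr[i] && $4 <= e[i] && $5 >= s[i]) next
            print
        }
    ' "$1" "$2"
}
```

The filtered GTF can then be given to HTSeq in place of the original annotation, leaving the BAM file untouched.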
Using this QC package for RNAseq: http://rseqc.sourceforge.net
split_bam.py would do the splitting of bam files.
That's a perfect solution!! Super cool! Thanks!!!
Do you need to specify the positive and negative strand as well?