How to remove reads associated with a specific region from bam file?
5
4
Entering edit mode
9.6 years ago

Dear Members,

Is there a way I can removes reads associated with a region (chr, start, end) from a .bam file (RNASeq data) prior to the application of HTSeq?

I will greatly appreciate your feedback

Noushin

alignment RNA-Seq samtools • 13k views
ADD COMMENT
9
Entering edit mode
8.1 years ago
bedtools intersect -abam file.bam -b filter.bed -v > filtered.bam

filter.bed should contain

chr    start     end
ADD COMMENT
0
Entering edit mode

Thats perfect solutions !! Super cool ! thanks !!!

ADD REPLY
0
Entering edit mode

Do you need to specify the positive and negative strand as well?

ADD REPLY
5
Entering edit mode
8.7 years ago
sunhanice ▴ 240

Just found, there is an option -U in samtools view. Use it like this:

samtools view input.bam -b -h -o output_inRegions.bam -U output_outRegions.bam -L Regions.bed
ADD COMMENT
1
Entering edit mode

Just to clarify, I used

samtools view in.sorted.bam -b -h -o inRegions.bam -U outRegions.bam "chr:start-stop"...

So here the -o file excludes the regions chr:start-stop and has the rest, but the -U file only retains the chr:start-stop? Thank you for your help!

ADD REPLY
4
Entering edit mode
9.6 years ago
John 13k

You'll want to use NGSUtil's bamutils tool, specifically with -excludebed.

But, I'd recommend you don't :P

The BAM format is to store highly compressed alignment data. You should treat them like raw, virgin data, without normalization/filtering tweaks here and there to get it into shape.

All that kind of intersection stuff should be done on processed signal data - wigs and bedgraphs, etc - where its much easier to have multiple versions of things and to just dump it all and start afresh from the .bam if you have to.

Having said that, its your data, do what you like with it :)

ADD COMMENT
3
Entering edit mode
8.7 years ago
igor 13k

If you are filtering your BAM for HTSeq, then you are doing extra work. You should just modify the GTF file that you are giving to HTSeq to exclude regions you do not want.

ADD COMMENT
2
Entering edit mode
8.1 years ago
Ron ★ 1.2k

Using this QC package for RNAseq: http://rseqc.sourceforge.net

Split_bam.py would do the splitting of bam files.

ADD COMMENT

Login before adding your answer.

Traffic: 2414 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6