Filter out reads with specific properties from a BAM file
4.7 years ago
xiaoleiusc ▴ 140

Dear Colleagues,

Is there a way to filter reads with specific properties out of a BAM file? For example, after mapping my small RNA reads to a reference genome with Bowtie, I would like to separate out all mapped reads that begin with a stretch of three G bases ("GGG"). I would like two output BAM files: one containing all the reads that start with "GGG" (the filtered-out reads) and the other containing all the remaining reads.

If this cannot be done, maybe I need to generate a BED file from the BAM file and do the operation on the BED file?

I would really appreciate any input.

Thanks in advance,

Xiao

RNA-Seq CLIP-seq • 1.1k views
4.7 years ago

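Using samjdk (http://lindenb.github.io/jvarkit/SamJdk.html): reads for which the expression returns true are written to out.bam, and the remaining reads go to the file given with --fail (excluded.bam).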

java -jar dist/samjdk.jar -e 'return record.getReadString().startsWith("GGG");' --fail excluded.bam --samoutputformat BAM in.bam > out.bam
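
One caveat worth checking before relying on the command above: a BAM file stores the sequence of a reverse-strand alignment as its reverse complement, so a read that was sequenced as "GGG..." but mapped to the minus strand is stored ending in "CCC". The following is only a minimal Python sketch of a strand-aware version of the same test, not part of samjdk. The helper names (original_read_sequence, starts_with_prefix, split_reads) and the toy records are made up for illustration, and the code works on plain tuples rather than on a real BAM file.

```python
# Hypothetical helpers for illustration; plain Python, no BAM library needed.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def original_read_sequence(stored_seq, is_reverse):
    """Return the read as it was sequenced.

    Reverse-strand alignments are stored reverse-complemented,
    so undo that transformation when is_reverse is True.
    """
    if is_reverse:
        return stored_seq.translate(_COMPLEMENT)[::-1]
    return stored_seq


def starts_with_prefix(stored_seq, is_reverse, prefix="GGG"):
    """True if the original (as-sequenced) read starts with prefix."""
    return original_read_sequence(stored_seq, is_reverse).upper().startswith(prefix)


def split_reads(records, prefix="GGG"):
    """Split (name, stored_seq, is_reverse) tuples into two lists of names."""
    matched, rest = [], []
    for name, seq, is_reverse in records:
        if starts_with_prefix(seq, is_reverse, prefix):
            matched.append(name)
        else:
            rest.append(name)
    return matched, rest


# Toy records (assumed data, not taken from any real BAM file).
records = [
    ("r1", "GGGATTC", False),  # forward strand, starts with GGG
    ("r2", "ATTCGGG", False),  # forward strand, GGG only at the end
    ("r3", "GAATCCC", True),   # reverse strand, original read is GGGATTC
    ("r4", "GGGATTC", True),   # reverse strand, original read is GAATCCC
]

matched, rest = split_reads(records)
print("start with GGG:", matched)
print("all other reads:", rest)
```

If you read the BAM with pysam, the stored sequence and the strand flag should correspond to read.query_sequence and read.is_reverse. If your Bowtie run only reported forward-strand alignments, the simpler startsWith test is already enough.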

Thanks a lot, samjdk works great in my case.


If it worked, please click the green check mark on the left to validate and close the question.


Do you mean the green checkmark? I just clicked it. Thanks.
