Filtering fastq vs BAM for fragment size range
5.9 years ago
rbronste ▴ 420

Would you get equivalent results when limiting fragment size to a range at the bowtie2 level (using the -I and -X flags to map only fragments in that range) versus filtering the resulting BAM file after aligning all fragments?

alignment bowtie2 bam fastq • 1.6k views

Could you include a little more information about the type of sequencing you're doing and why you're only interested in a specific size range? If you're interested in multiple size ranges in something like ATAC-seq, I'd recommend mapping once and then subsetting the BAM file by the sizes you want; that way you don't need to remap your fastq file. But I can't be sure that's what you're asking.
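To make the "map once, subset many times" idea concrete, here is a minimal sketch that filters SAM text by the TLEN column (field 9). The helper names filter_tlen and demo_sam, the size bins, and the demo records are all made up for illustration. With a real file you would presumably feed it from something like `samtools view -h in.bam` and convert back with `samtools view -b -o out.bam -`.

```shell
# Sketch only: filter_tlen and demo_sam are hypothetical helpers, not part of any tool.

# Read SAM on stdin, write SAM on stdout.
# $1 = minimum |TLEN|, $2 = maximum |TLEN| (both inclusive).
filter_tlen() {
  awk -v MIN="$1" -v MAX="$2" '
    /^@/ { print; next }             # always keep header lines
    { t = ($9 < 0) ? -$9 : $9 }      # absolute template length
    t != 0 && t >= MIN && t <= MAX   # print records inside the range
  '
}

# Made-up input: three read pairs with |TLEN| of 80, 180 and 400.
demo_sam() {
  printf '@HD\tVN:1.6\tSO:unsorted\n'
  printf '@SQ\tSN:chr1\tLN:10000\n'
  printf 'r1\t99\tchr1\t100\t42\t50M\t=\t130\t80\t*\t*\n'
  printf 'r1\t147\tchr1\t130\t42\t50M\t=\t100\t-80\t*\t*\n'
  printf 'r2\t99\tchr1\t500\t42\t50M\t=\t630\t180\t*\t*\n'
  printf 'r2\t147\tchr1\t630\t42\t50M\t=\t500\t-180\t*\t*\n'
  printf 'r3\t99\tchr1\t900\t42\t50M\t=\t1250\t400\t*\t*\n'
  printf 'r3\t147\tchr1\t1250\t42\t50M\t=\t900\t-400\t*\t*\n'
}

# One pass over the same alignments per size bin; no re-mapping needed.
for range in 1-100 101-250 251-500; do
  lo=${range%-*}
  hi=${range#*-}
  n=$(demo_sam | filter_tlen "$lo" "$hi" | grep -vc '^@')
  printf '%s bp: %s records\n' "$range" "$n"
done
```

Because the filter only reads the alignments, adding or changing a size bin costs one more pass over the BAM rather than a new bowtie2 run.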

5.9 years ago
ATpoint 85k

Simply try it. Given that you asked how to subset a BAM by TLEN in a previous thread, here is a possible solution:

# Keep header lines plus records with 0 < |TLEN| <= $2; writes <input>_isize<LEN>.bam
function TLEN {
  sambamba view -f sam -h "$1" | \
    mawk -v LEN="$2" '{if (($9 <= LEN && $9 >= -(LEN) && $9 != 0) || $1 ~ /^@/) print $0}' | \
    sambamba view -S -f bam -h -o "${1%.bam}_isize${2}.bam" /dev/stdin
}; export -f TLEN

## Subset BAM for fragments with |TLEN| <= 100 bp
TLEN in.bam 100

This is a nice approach, thanks. I guess my question was more whether the two approaches (filtering before or after alignment) would produce an equivalent final BAM file.


Yeah, but I cannot answer this, as I have never tried it. I would always filter after alignment, because that way you have all the data available and can try multiple filtering approaches without re-aligning everything if the options chosen during alignment turn out not to be optimal.
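If someone does want to test it, one way would be to compare the read names that survive each route. The sketch below assumes you already have two plain-text lists of read names, one per line, for example from `samtools view in.bam | cut -f1`. The helper name compare_names and its output labels are made up for illustration. As far as I understand the bowtie2 manual, -I and -X define which pairs count as concordant rather than discarding reads outright, so pairs outside the range can still show up as discordant or unpaired alignments unless --no-discordant and --no-mixed are set; a comparison like this would show how much that matters for your data.

```shell
# Sketch only: compare_names is a hypothetical helper, not part of any tool.
# $1, $2 = files with one read name per line (e.g. one list per filtering route).
compare_names() {
  cn_a=$(mktemp)
  cn_b=$(mktemp)
  # comm needs both inputs sorted with the same collation
  LC_ALL=C sort -u "$1" > "$cn_a"
  LC_ALL=C sort -u "$2" > "$cn_b"
  printf 'only_first\t%s\n'  "$(LC_ALL=C comm -23 "$cn_a" "$cn_b" | wc -l | tr -d ' ')"
  printf 'only_second\t%s\n' "$(LC_ALL=C comm -13 "$cn_a" "$cn_b" | wc -l | tr -d ' ')"
  printf 'shared\t%s\n'      "$(LC_ALL=C comm -12 "$cn_a" "$cn_b" | wc -l | tr -d ' ')"
  rm -f "$cn_a" "$cn_b"
}
```

If both "only" counts are zero, the two routes kept the same reads; matching names alone does not prove the alignments themselves (position, MAPQ, flags) are identical, so that would need a separate check.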
