Question

getting bwa to ignore some parts of a read

0


6.6 years ago

Fedster ▴ 30

Hi All,

I have some (incredibly bad) Illumina HiSeq 4000 reads, where I know (thanks to FastQC) that (1) the first ~5 bases are low quality and (2) the last ~50 bases are also quite bad.

I would like to tell BWA to ignore these parts of the reads without trimming them off. Once I have sorted BAM files, I use Stacks to call SNPs, and I would like to keep track of where the SNPs are in the fragment even though part of the fragment is not actually used in the BWA alignment. How to achieve this is not immediately obvious to me, though.

bwa illumina BWA • 1.1k views

ADD COMMENT • link updated 6.6 years ago by swbarnes2 14k • written 6.6 years ago by Fedster ▴ 30

score 0 · Answer 1 · 2018-04-26

0


6.6 years ago

swbarnes2 14k

I don't know that bwa will do this. STAR should be able to soft-clip a read to get an alignment, meaning that the unused portion of the read remains in the .bam file.
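To illustrate what soft-clipping means for the bookkeeping, here is a minimal sketch in plain Python. It assumes the standard SAM convention that soft-clipped bases (`S` in the CIGAR string) stay in the stored read sequence, while hard-clipped bases (`H`) do not. The function names and the 150-base read with CIGAR `5S95M50S` are hypothetical and only for illustration; this is not output from bwa or STAR.

```python
import re

# One CIGAR operation: a length followed by a SAM operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clip_lengths(cigar):
    """Return (leading, trailing) soft-clipped base counts for a CIGAR string."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Hard clips (H) can sit outside soft clips; drop them, because
    # hard-clipped bases are not present in the stored read sequence.
    core = [(n, op) for n, op in ops if op != "H"]
    leading = core[0][0] if core and core[0][1] == "S" else 0
    trailing = core[-1][0] if len(core) > 1 and core[-1][1] == "S" else 0
    return leading, trailing


def aligned_slice(seq, cigar):
    """Return the part of the stored read that actually took part in the alignment."""
    leading, trailing = soft_clip_lengths(cigar)
    return seq[leading:len(seq) - trailing]


# Hypothetical read: 150 bases, first 5 and last 50 soft-clipped.
read = "A" * 150
leading, trailing = soft_clip_lengths("5S95M50S")
used = aligned_slice(read, "5S95M50S")
```

Because the full sequence is still in the record, a position within the aligned part can be mapped back to the original read by adding the leading soft-clip length, at least when no insertions or deletions come before it.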

ADD COMMENT • link 6.6 years ago by swbarnes2 14k

0


Sadly, my sequence data is genomic DNA from GBS, not RNA-seq. Would STAR still work?

ADD REPLY • link 6.6 years ago by Fedster ▴ 30

0


I think it will still work. The virtue of STAR is that it aligns to a whole genome while understanding that many reads need giant gaps to align properly, so aligning reads without big gaps is an easier case than what it is used to. You can probably run it with no gtf, or maybe make a fake gtf where one chromosome = one giant unspliced gene. That might be quicker than trying to bend bwa to do something it's not supposed to do.
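If you go the fake gtf route, a rough sketch of generating one could look like the following. It assumes the usual nine tab-separated GTF columns with 1-based inclusive coordinates, and that you already know each chromosome's length (for example from the FASTA index). The function name `fake_gtf_lines`, the `fake` source label, and the chromosome sizes are made up for illustration, and whether STAR is happy with such a file is something to check rather than take on trust.

```python
def fake_gtf_lines(chrom_sizes, source="fake"):
    """Yield GTF lines describing one unspliced gene spanning each chromosome.

    chrom_sizes maps chromosome name -> length in bases (hypothetical input).
    """
    for chrom, length in chrom_sizes.items():
        gene_attrs = f'gene_id "{chrom}_gene";'
        tx_attrs = f'gene_id "{chrom}_gene"; transcript_id "{chrom}_tx";'
        features = (
            ("gene", gene_attrs),
            ("transcript", tx_attrs),
            ("exon", tx_attrs),  # a single exon, so no splice junctions
        )
        for feature, attrs in features:
            # Columns: seqname, source, feature, start, end,
            #          score, strand, frame, attributes
            yield "\t".join(
                [chrom, source, feature, "1", str(length), ".", "+", ".", attrs]
            )


# Hypothetical chromosome lengths, for illustration only.
sizes = {"chr1": 1000, "chr2": 500}
lines = list(fake_gtf_lines(sizes))
gtf_text = "\n".join(lines) + "\n"
```

Writing `gtf_text` to a file gives three lines per chromosome (gene, transcript, exon), each covering the chromosome from base 1 to its last base.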

ADD REPLY • link 6.6 years ago by swbarnes2 14k