Question

Extracting the best reads that align multiple times

7.6 years ago

ioannis ▴ 50

Hello community,

I am using bowtie2 to align sequences to a reference genome. The results are quite disappointing: 48% of the reads align exactly once and 44% align more than once.

I have single-end reads 55-70bp long. The reference genome is the OreoNil2 (Oreochromis niloticus).

I am not sure about this, but I guess that a read that aligns multiple times gets a different score at each location, depending on how well it aligns to the reference genome there. I would like to extract into a new SAM file the reads that align only once (48%) and, for the reads that align multiple times, the alignment with the best score.

Does anybody know if this is possible and how to do something like that? Do I introduce any bias if I pick those reads?

Thanks in advance!

alignment • 2.1k views
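
The selection described in the question could be sketched roughly as follows. This is a minimal sketch, not a vetted pipeline: it assumes bowtie2 was run in its default reporting mode (one reported alignment per read) and that the SAM records carry bowtie2's optional `AS:i` (score of the reported alignment) and `XS:i` (score of the best other alignment found, absent when none was found) fields. The function names `parse_tags`, `classify`, and `filter_sam` are hypothetical helpers made up for this illustration.

```python
def parse_tags(fields):
    """Return the optional SAM fields (columns 12+) as a dict of TAG -> string value."""
    tags = {}
    for item in fields[11:]:
        tag, _type, value = item.split(":", 2)
        tags[tag] = value
    return tags


def classify(sam_line):
    """Classify one SAM alignment record as 'unmapped', 'unique', 'best', or 'tie'.

    Assumption: AS:i / XS:i follow bowtie2's convention. A missing XS:i is
    treated as "no second alignment found", i.e. the read aligned once.
    """
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    if flag & 0x4:  # SAM flag bit 0x4: read is unmapped
        return "unmapped"
    tags = parse_tags(fields)
    if "XS" not in tags or "AS" not in tags:
        return "unique"
    # Reported alignment scores strictly higher than the runner-up.
    if int(tags["AS"]) > int(tags["XS"]):
        return "best"
    # Equal scores: the reported location is one of several equally good ones.
    return "tie"


def filter_sam(lines, keep=("unique", "best")):
    """Yield header lines plus the records whose class is listed in `keep`."""
    for line in lines:
        if line.startswith("@") or classify(line) in keep:
            yield line
```

A filtered file could then be written with something like `out.writelines(filter_sam(open("aligned.sam")))`, where `aligned.sam` is a placeholder file name. Reads classified as `tie` have no single best location, so keeping them would mean accepting an arbitrary placement; dropping them is one source of the bias raised in the replies.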

It might be worth trying bwa and comparing the results. If these are paired-end reads, I would expect a smaller proportion of multiple mappings.

7.6 years ago by abascalfederico ★ 1.2k

Once you pick a subset of reads with higher scores, yes, you will introduce bias. What is your ultimate goal?

7.6 years ago by Brian Bushnell 20k

My goal is to get as many alignments as I can, but as it seems, I can use less than 50% of my total reads. I have hydroxymethylation data and I need as much coverage as I can get. I will try different aligners just to see if I get better results. However, I think that bowtie2 is quite a good aligner, so I am not hoping for a miracle. Thank you for your input!

7.6 years ago by ioannis ▴ 50

You'll typically get a much higher alignment rate with BBMap than with Bowtie2 when the data has low identity to the reference. In particular, you can add the flag "slow" or "vslow" and use a shorter kmer length, such as 11, to increase the alignment rate even more.

7.6 years ago by Brian Bushnell 20k