Question

Bowtie2 mapped just 50% of the reads

0

5.8 years ago

silas008 ▴ 170

Hey, guys. I need some help with Bowtie2. It would be great if you could help me.

I am trying to align small RNA-seq data to the C. elegans genome. The pre-processing looks fine (good base quality, no adapters, reads are trimmed, etc.), but in the alignment step Bowtie2 aligned only about 50% of the reads. This is unusual: I have used Bowtie2 many times for this kind of alignment and it has always worked well.

The read length is about 22 nt, and the command line used for this mapping was:

bowtie2 -p 4 --very-sensitive-local -x ce11  -U trimmed.fastq -S .sam

Do you know if there are big differences between Bowtie2 version 2 and Bowtie2 version 3? I am using version 3 for the first time.

Thank you very much.

RNA-Seq • 2.5k views

ADD COMMENT • link updated 5.8 years ago by Biostar 20 • written 5.8 years ago by silas008 ▴ 170

3

Doesn't the Bowtie2 manual say that Bowtie1 performs better with reads shorter than 50 nt? Your reads are barely longer than the seed length used by --very-sensitive-local (20 nt).

Perhaps the reads that did not align are shorter than 20 nt?
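
One way to check this is to tabulate the read lengths in the trimmed file. The following is only a minimal sketch, assuming a standard FASTQ with four lines per record (no wrapped sequences); the function names `read_length_counts` and `fraction_shorter_than` are made up for illustration and are not part of any tool mentioned here.

```python
import gzip
from collections import Counter


def read_length_counts(fastq_path):
    """Count reads per length in a FASTQ file (plain text or .gz).

    Assumes four lines per record; wrapped sequences are not handled.
    """
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    counts = Counter()
    with opener(fastq_path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 == 1:  # the sequence is the 2nd line of each record
                counts[len(line.strip())] += 1
    return counts


def fraction_shorter_than(counts, min_len=20):
    """Fraction of reads whose length is strictly below min_len."""
    total = sum(counts.values())
    short = sum(n for length, n in counts.items() if length < min_len)
    return short / total if total else 0.0
```

For example, `fraction_shorter_than(read_length_counts("trimmed.fastq"), 20)` gives the share of reads below 20 nt. If that share is close to the unaligned fraction, read length is a likely explanation.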

ADD REPLY • link 5.8 years ago by 5heikki 11k

0

That is a good point. I will try again with Bowtie1 to see what happens.

Thanks

ADD REPLY • link 5.8 years ago by silas008 ▴ 170

0

Have you checked for contamination?

ADD REPLY • link 5.8 years ago by Joe 21k

0

I don't think this is the problem, because the STAR aligner mapped more than 90% of the reads.

Another point: I think I mapped this data about a year ago using Bowtie2 2.2.7 instead of the latest release, and it also worked well.

ADD REPLY • link 5.8 years ago by silas008 ▴ 170

0

Very probably right. With low mapping rates contamination is always my first suspicion though :P

ADD REPLY • link 5.8 years ago by Joe 21k

0

In general, contaminants such as rRNA show up as millions of overrepresented reads. I checked the overrepresented sequences in FastQC and they are miRNA reads.
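
For anyone who wants to repeat that check outside FastQC, here is a rough sketch that lists the most abundant sequences in a FASTQ file. It assumes an uncompressed file with four lines per record, and `top_sequences` is just an illustrative name, not a function from FastQC or any other tool.

```python
from collections import Counter


def top_sequences(fastq_path, n=10):
    """Return the n most frequent read sequences as (sequence, count, fraction).

    Assumes an uncompressed FASTQ with four lines per record.
    """
    counts = Counter()
    with open(fastq_path) as handle:
        for i, line in enumerate(handle):
            if i % 4 == 1:  # the sequence is the 2nd line of each record
                counts[line.strip()] += 1
    total = sum(counts.values())
    return [(seq, c, c / total) for seq, c in counts.most_common(n)]
```

The top sequences can then be compared against known miRNA or rRNA sequences to see what dominates the library.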

Do you think I should use a specific program for that?

Thanks

ADD REPLY • link 5.8 years ago by silas008 ▴ 170

0

Is your library stranded?

ADD REPLY • link 5.8 years ago by Matteo Schiavinato ★ 3.6k

score 1 · Answer 1 · 2019-01-18

If you align RNA reads to the genome of a eukaryotic organism, C. elegans in this case, you should use a splice-aware aligner. STAR is a splice-aware aligner, meaning it accounts for exon-intron boundaries when aligning RNA reads to the genome. HISAT2 is another one.

This is why you got a much lower mapping rate with Bowtie2 than with STAR.