Hi guys, I have a question about Bowtie2.
My reference includes several sequences of different lengths (imagine a Trinity assembly: several 'genes' and their variants). I tried running bowtie2-build on each sequence separately and then mapping my reads to each index (alignment rates A1, A2, A3, A4), and I also tried running bowtie2-build on all sequences together as one big reference (alignment rate B). The results are different: A1+A2+A3+A4 does not equal B and is way smaller than B. Has anyone encountered this, and can you help me explain what is going on?
All mapping was done with default settings, and the alignment rates are taken from the alignment summary.
Extra question: how reliable and practical is the alignment summary output of Bowtie2?
Thank you. Emily
Sorry, to clarify: I used the same set of reads for both mappings.
If you align reads to a single genome, each read will map to its best site (as long as the alignment scores above a certain threshold).
If you align the same reads to individual genomes separately, a read can potentially align up to four times (once each in A1, A2, A3, and A4), but if you align to B it may align only once.
So it makes sense that the numbers are very different.
What would be scary is if the alignment rate for any individual A[1-4] genome were somehow higher than the rate for B (assuming you are allowing multi-mappers).
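
To make the counting argument concrete, here is a toy sketch in Python. It does not run Bowtie2; the read IDs and the sets of reads aligning to each reference are made up for illustration, and it assumes a read aligns to the combined reference B exactly when it aligns to at least one of A1 to A4, which is a simplification of what a real aligner does.

```python
# Toy model of the counting argument (hypothetical data, not Bowtie2 output).
# Each read is an integer ID; each reference lists the reads that align to it.
total_reads = 10
aligned = {
    "A1": {0, 1, 2, 3},
    "A2": {2, 3, 4},   # reads 2 and 3 also align to A1
    "A3": {3, 5},      # read 3 aligns to three references
    "A4": {6},
}

# Separate runs: every read is counted once per reference it aligns to.
separate_rates = {name: len(ids) / total_reads for name, ids in aligned.items()}
sum_of_separate = sum(len(ids) for ids in aligned.values()) / total_reads

# Combined run (assumption): a read is counted once if it aligns anywhere.
reads_aligned_anywhere = set().union(*aligned.values())
combined_rate = len(reads_aligned_anywhere) / total_reads

print("separate rates:", separate_rates)
print("sum of separate rates:", sum_of_separate)
print("combined rate (B):", combined_rate)
```

In this toy data the separate rates add up to more than the combined rate because shared reads are counted several times, and no single separate rate exceeds the combined one, which is the sanity check mentioned above.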