Hello!
I have paired-end exome sequencing data covering only 50 human genes. I mapped the reads to the hg19 assembly using bowtie2 with default parameters:
bowtie2 --end-to-end -x /path/to/human/genome -q -1 input1.fastq -2 input2.fastq -S output.sam
After the alignment, I used the Picard SortSam tool to sort the file and convert it to BAM.
When I viewed the alignment, I found some regions where only reverse reads were mapped and there were no forward reads.
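One way to put numbers on this observation is to tally mapped and unmapped records per mate straight from the SAM file. The sketch below is not part of the original workflow. It assumes "forward" and "reverse" reads mean mate 1 and mate 2 of each pair, and `mate_mapping_counts` is a hypothetical helper name.

```shell
# mate_mapping_counts SAM_FILE
# Prints tab-separated lines: mate, mapped/unmapped, count.
# Assumption: "forward"/"reverse" reads correspond to mate 1 / mate 2.
mate_mapping_counts() {
    awk -F '\t' '
        /^@/ { next }                                   # skip SAM header lines
        int($2 / 256) % 2 == 1 { next }                 # 0x100: secondary alignment
        int($2 / 2048) % 2 == 1 { next }                # 0x800: supplementary alignment
        {
            mate = "unpaired"
            if (int($2 / 64) % 2 == 1) mate = "mate1"   # 0x40: first in pair
            else if (int($2 / 128) % 2 == 1) mate = "mate2"  # 0x80: second in pair
            state = (int($2 / 4) % 2 == 1) ? "unmapped" : "mapped"  # 0x4: read unmapped
            n[mate "\t" state]++
        }
        END { for (k in n) print k "\t" n[k] }
    ' "$1" | sort
}

# Usage (output.sam is the file name from the post above):
#   mate_mapping_counts output.sam
```

A large "mate1 unmapped" count next to a small "mate2 unmapped" count would match what the viewer shows.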
The question is:
In theory, Bowtie should only map forward and reverse reads together, as a pair. However, perhaps because of some quality issue, only the reverse reads were aligned. Is this true?
Is this an alignment issue or a trimming issue?
Is this a bigger problem for the interpretation of mutations (SNPs)?
Which parameter/flag should I use to get a better alignment?
Thanks in advance
Adding to my previous question:
I ran TopHat2 with default parameters on the same input; the alignment was better, and IGV showed both reads.
In principle, TopHat2 implements the Bowtie alignment strategy, but it still misses some read strands.
Any suggestion or possible explanation for the difference or the missing reads would be appreciated.
I assume your input files are FASTQ, not BAM.
Anyway, to find out why this is happening, get the sequences of a few of the unaligned mates and BLAST them. That should give you a clue as to why they're not aligning.
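A minimal sketch of pulling those sequences out of the SAM file as FASTA, ready to paste into BLAST. It assumes the aligner wrote unaligned mates to the SAM output (bowtie2 does so unless told otherwise); `unaligned_mates_to_fasta` and its default limit of 10 records are hypothetical choices for illustration.

```shell
# unaligned_mates_to_fasta SAM_FILE [MAX_RECORDS]
# Prints FASTA for reads that are unmapped while their mate is mapped.
# If samtools is installed, "samtools view -f 4 -F 8 SAM_FILE" selects
# the same records.
unaligned_mates_to_fasta() {
    awk -F '\t' -v max="${2:-10}" '
        /^@/ { next }                      # skip SAM header lines
        $2 % 2 == 1 &&                     # 0x1: read is paired
        int($2 / 4) % 2 == 1 &&            # 0x4: this read is unmapped
        int($2 / 8) % 2 == 0 {             # 0x8 not set: mate is mapped
            printf ">%s\n%s\n", $1, $10    # read name and sequence
            if (++n >= max) exit
        }
    ' "$1"
}

# Usage (output.sam is the file name from the original post):
#   unaligned_mates_to_fasta output.sam 5 > unaligned_mates.fa
```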
Yes, you are correct: the input files are FASTQ, not BAM. Sorry for the mistake.
Thanks. Do you think this could be because of the alignment strategy?
My suspicion is the --end-to-end versus local alignment strategy.
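To compare the two strategies on the same input, something along these lines could be used. This is only a sketch: `align_pairs` is a hypothetical wrapper, the `BOWTIE2` variable is an assumed override that allows a dry run with `echo`, and the index and file names are the placeholders from the original command. `--end-to-end`, `--local` and `--very-sensitive-local` are bowtie2 options.

```shell
# align_pairs MODE INDEX R1_FASTQ R2_FASTQ OUT_SAM
# MODE is "end-to-end" or "local".
align_pairs() {
    mode=$1; index=$2; r1=$3; r2=$4; out=$5
    case "$mode" in
        end-to-end) set -- --end-to-end ;;                   # whole read must align
        local)      set -- --local --very-sensitive-local ;; # read ends may be soft-clipped
        *) echo "unknown mode: $mode" >&2; return 2 ;;
    esac
    # BOWTIE2 is a hypothetical override; it defaults to the bowtie2 executable.
    ${BOWTIE2:-bowtie2} "$@" -x "$index" -q -1 "$r1" -2 "$r2" -S "$out"
}

# Usage:
#   align_pairs end-to-end /path/to/human/genome input1.fastq input2.fastq e2e.sam
#   align_pairs local      /path/to/human/genome input1.fastq input2.fastq local.sam
```

Viewing both outputs at one of the affected regions should show whether the missing mates come back in local mode.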
That's worth a shot. If some of the unmapped reads align reasonably well with BLAST, then give --very-sensitive-local a try.
You have an extra hyphen in the --end-to-end option (at least the way you are typing it in these posts).
Did you trim your data with a PE-aware trimming program? If you trimmed the two PE files independently, you may have lost reads from one file but not the other. That would be a bad idea, since aligners don't check whether the reads in the input files are in the proper order.
If the above is true, you should start this analysis over: re-trim (this time using both files and a PE-aware trimmer) and then realign the data.
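A quick way to check whether the two files are still in step is to compare their read IDs record by record. The sketch below assumes standard four-line FASTQ records and read names that differ between mates only by an optional trailing /1 or /2 or by a comment after the first space; `fastq_ids` and `pairs_in_sync` are hypothetical helper names.

```shell
# fastq_ids FASTQ_FILE
# Prints one read ID per record, without the leading "@", any comment
# after the first whitespace, or a trailing /1 or /2.
fastq_ids() {
    awk 'NR % 4 == 1 {
        id = $1
        sub(/^@/, "", id)
        sub(/\/[12]$/, "", id)
        print id
    }' "$1"
}

# pairs_in_sync R1_FASTQ R2_FASTQ
# Exit status 0 if both files list the same read IDs in the same order.
pairs_in_sync() {
    ids1=$(mktemp) || return 2
    ids2=$(mktemp) || { rm -f "$ids1"; return 2; }
    fastq_ids "$1" > "$ids1"
    fastq_ids "$2" > "$ids2"
    cmp -s "$ids1" "$ids2"
    status=$?
    rm -f "$ids1" "$ids2"
    return $status
}

# Usage:
#   pairs_in_sync input1.fastq input2.fastq && echo "in sync" || echo "out of sync"
```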
No, this is not a trimming issue; I double-checked.
For now, I am using TopHat2 instead of bowtie2 for alignment, and the issue is resolved. However, I find this very interesting, since TopHat2 uses bowtie2 for alignment.