I just ran my first successful (I think) DNA alignment with `bowtie2` on some paired-end data (namely SRR390728), and these were the results I got:

    7178576 reads; of these:
      7178576 (100.00%) were paired; of these:
        7177909 (99.99%) aligned concordantly 0 times
        225 (0.00%) aligned concordantly exactly 1 time
        442 (0.01%) aligned concordantly >1 times
        ----
        7177909 pairs aligned concordantly 0 times; of these:
          1624832 (22.64%) aligned discordantly 1 time
        ----
        5553077 pairs aligned 0 times concordantly or discordantly; of these:
          11106154 mates make up the pairs; of these:
            1231919 (11.09%) aligned 0 times
            1884213 (16.97%) aligned exactly 1 time
            7990022 (71.94%) aligned >1 times
    91.42% overall alignment rate

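For anyone wondering how a 91.42% overall rate can sit next to almost no concordant pairs, here is a minimal sketch of the arithmetic, using only the counts from the summary above. It assumes the overall rate counts individual mates (two per aligned pair, plus mates that aligned on their own) rather than pairs. The function and variable names are made up for illustration and are not part of Bowtie 2.

```python
# Counts copied from the Bowtie 2 summary above.
TOTAL_PAIRS = 7178576
CONCORDANT_ONCE = 225
CONCORDANT_MULTI = 442
DISCORDANT_ONCE = 1624832
MATES_ALIGNED_ONCE = 1884213
MATES_ALIGNED_MULTI = 7990022


def overall_alignment_rate(total_pairs, conc_once, conc_multi,
                           disc_once, mate_once, mate_multi):
    """Fraction of individual mates that aligned in any way.

    Assumption: every concordant or discordant pair contributes two
    aligned mates, and mates from unaligned pairs are counted singly.
    """
    aligned_pairs = conc_once + conc_multi + disc_once
    aligned_mates = 2 * aligned_pairs + mate_once + mate_multi
    return aligned_mates / (2 * total_pairs)


rate = overall_alignment_rate(
    TOTAL_PAIRS, CONCORDANT_ONCE, CONCORDANT_MULTI,
    DISCORDANT_ONCE, MATES_ALIGNED_ONCE, MATES_ALIGNED_MULTI,
)
concordant_fraction = (CONCORDANT_ONCE + CONCORDANT_MULTI) / TOTAL_PAIRS

print(f"overall alignment rate: {rate:.2%}")          # 91.42%
print(f"concordant pairs:       {concordant_fraction:.2%}")  # 0.01%
```

So the headline number mostly reflects mates that aligned individually or discordantly, not pairs that aligned the way a paired-end library is expected to.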
So my first impression was "WOOOOO!!! AWESOME", but on further inspection I saw:
7177909 (99.99%) aligned concordantly 0 times
And so my second impression was "hmmmm.......", and now here I am. From the Bowtie 2 manual I found that:
- "A pair that aligns with the expected relative mate orientation and with the expected range of distances between mates is said to align "concordantly". If both mates have unique alignments, but the alignments do not match paired-end expectations (i.e. the mates aren't in the expected relative orientation, or aren't within the expected distance range, or both), the pair is said to align "discordantly"."
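To make that definition concrete, here is a rough sketch of the check it describes. It is a simplification and not Bowtie 2's actual code: it assumes the default `--fr` orientation and the default fragment length window of 0 to 500 bp (`-I 0 -X 500`), and it ignores the finer options for overlapping or dovetailing mates. `MateAlignment` and `classify_pair` are hypothetical names used only for this example.

```python
from dataclasses import dataclass


@dataclass
class MateAlignment:
    chrom: str
    start: int   # leftmost reference position, 0-based
    end: int     # one past the rightmost aligned position
    strand: str  # "+" or "-"


def classify_pair(m1, m2, min_frag=0, max_frag=500):
    """Label a pair of uniquely aligned mates (simplified sketch).

    Assumes --fr orientation: the forward-strand mate sits upstream
    of the reverse-strand mate, and the fragment they span falls
    within [min_frag, max_frag].
    """
    if m1.chrom != m2.chrom:
        return "discordant"
    if m1.strand == m2.strand:
        return "discordant"  # both mates on the same strand
    fwd, rev = (m1, m2) if m1.strand == "+" else (m2, m1)
    if fwd.start > rev.start:
        return "discordant"  # mates point away from each other
    fragment_length = rev.end - fwd.start
    if min_frag <= fragment_length <= max_frag:
        return "concordant"
    return "discordant"      # too far apart (or too close)


# Mates about 200 bp apart in the expected orientation.
near = classify_pair(MateAlignment("chr1", 1000, 1036, "+"),
                     MateAlignment("chr1", 1200, 1236, "-"))

# Same orientation, but 5 kb apart on the reference, as could happen
# if the two mates sit in exons separated by an intron.
far = classify_pair(MateAlignment("chr1", 1000, 1036, "+"),
                    MateAlignment("chr1", 6000, 6036, "-"))

print(near, far)
```

Under these assumptions, a pair whose mates land on either side of a long gap in the reference is reported as discordant even though both mates align fine on their own.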
Now I'm confused about how this information applies to my data. Should I interpret this as a problem with my dataset? Is my alignment reliable? What else should this lead me to consider?
This is an RNA-seq dataset. Presumably you meant to align with a splice-aware aligner such as HISAT or TopHat.
Would that be the likely explanation behind this issue?
That'll at least explain a large chunk of the problem. Did you trim this data at all?
I didn't trim it (I don't really know what anything is at this point), but I let it run overnight with the `--local` option, since I understand that it is more lenient about the starts and ends of reads (which I assume is what you were getting at when you asked about trimming), and now we've got 0.14% concordant alignments :P Not much better. I'm going to try running it through TopHat now.
Is "trimming" something that you recommend should always be done? I'm still trying to figure out a kind of standard procedure so it'd help to know what other people recommend :)