Hi All
I am getting what appears to be an anomalous alignment with bowtie2 (v2.3.4). Consider the following:
reference 1
M00831:461:000000000-CP4BP:1:2105:10688:18697 99 AGRO_LBA4404_1|NODE_23 996 31 100M = 1213 267 CTCGACTGGCAATGAGAAGTTGCTCGCGCGATAGAACGTCGCGGGGTTTCTCTAAAAACGCGAGGAGAAGATTGAACTCACCTGCCGTAAGTTTCACCTC CCCCCGGGGGGGGGGGGGGGGGGFFGGGGEGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFGFGGGGGGGGGGGGGGGGGFGGGGFGGG AS:i:200 XS:i:200 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:100 YS:i:100 YT:Z:CP
reference 2
M00831:461:000000000-CP4BP:1:2105:10688:18697 83 24653-1agrWset1 11508 44 100M = 11341 -267 GAGGTGAAACTTACGGCAGGTGAGTTCAATCTTCTCCTCGCGTTTTTAGAGAAACCCCGCGACGTTCTATCGCGCGAGCAACTTCTCATTGCCAGTCGAG GGGFGGGGFGGGGGGGGGGGGGGGGGFGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGEGGGGFFGGGGGGGGGGGGGGGGGGCCCCC AS:i:200 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:100
The difference in orientation aside, the only difference to my eye between this read's alignments to the two databases is the presence of the XS tag in the alignment to reference 1, an indication that the read aligns to more than one location in that reference. In other words, it seems there is no difference in the quality of the alignment itself, as indicated by the AS and MD tags.
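To compare the tags without eyeballing them, here is a minimal Python sketch that parses the optional fields copied from the two SAM records above and lists what differs. The helper name `parse_tags` is only illustrative (it is not part of bowtie2 or any library), and the input is just the tag text as pasted in this post.

```python
def parse_tags(text):
    """Parse SAM optional fields of the form TAG:TYPE:VALUE into a dict."""
    tags = {}
    for field in text.split():
        tag, typ, value = field.split(":", 2)
        # Type 'i' is an integer; everything else is kept as a string here.
        tags[tag] = int(value) if typ == "i" else value
    return tags


# Optional fields copied verbatim from the two SAM records in this post.
ref1 = parse_tags("AS:i:200 XS:i:200 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:100 YS:i:100 YT:Z:CP")
ref2 = parse_tags("AS:i:200 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:100")

shared = ref1.keys() & ref2.keys()
differing = sorted(t for t in shared if ref1[t] != ref2[t])
only_ref1 = sorted(ref1.keys() - ref2.keys())
only_ref2 = sorted(ref2.keys() - ref1.keys())

print("shared tags with different values:", differing)
print("tags only in the reference 1 record:", only_ref1)
print("tags only in the reference 2 record:", only_ref2)
```

On the records as pasted, every shared tag (AS, NM, MD and the rest) has the same value in both, while XS, YS and YT appear only in the reference 1 record.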
However, when I align this read with blast there is a difference in the alignments, as shown below. (Sorry, I can't seem to get it to maintain the formatting for the alignments.)
Database: reference 1
Query= M00831:461:000000000-CP4BP:1:2105:10688:18697
Length=100
Score E
Sequences producing significant alignments: (Bits) Value
seq1 185 1e-47
> seq1
Length=176344
Score = 185 bits (100), Expect = 1e-47
Identities = 100/100 (100%), Gaps = 0/100 (0%)
Strand=Plus/Plus
Query  1       ACGGATAAAGTTGTTGCACTCGAGCTAGGAGCAAGTGATTTTATCGCTAAGCCGTTTAGT  60
               ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  123352  ACGGATAAAGTTGTTGCACTCGAGCTAGGAGCAAGTGATTTTATCGCTAAGCCGTTTAGT  123411
Query  61      ACGAGAGAGTTTCTTGCACGCATTCGGGTTGCCTTGCGCG  100
               ||||||||||||||||||||||||||||||||||||||||
Sbjct  123412  ACGAGAGAGTTTCTTGCACGCATTCGGGTTGCCTTGCGCG  123451
Database: reference 2
Query= M00831:461:000000000-CP4BP:1:2105:10688:18697
Length=100
Score E
Sequences producing significant alignments: (Bits) Value
seq1 163 7e-44
> seq1
Length=14823
Score = 163 bits (88), Expect = 7e-44
Identities = 96/100 (96%), Gaps = 0/100 (0%)
Strand=Plus/Plus
Query  1      ACGGATAAAGTTGTTGCACTCGAGCTAGGAGCAAGTGATTTTATCGCTAAGCCGTTTAGT  60
              |||||||||||||||||||||||||||||||||||||||||||||||||||||||| |||
Sbjct  11307  ACGGATAAAGTTGTTGCACTCGAGCTAGGAGCAAGTGATTTTATCGCTAAGCCGTTCAGT  11366
Query  61     ACGAGAGAGTTTCTTGCACGCATTCGGGTTGCCTTGCGCG  100
              |  ||||||||||| |||||||||||||||||||||||||
Sbjct  11367  ATCAGAGAGTTTCTAGCACGCATTCGGGTTGCCTTGCGCG  11406
In the blast result there is a 4-mismatch difference between the alignments of this read to reference 1 and 2, despite bowtie2 indicating both alignments are equal and perfect.
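The mismatch count can be checked directly from the reference 2 BLAST alignment above. This is a minimal Python sketch, assuming an ungapped alignment (BLAST reports Gaps = 0/100) so that a position-by-position comparison is valid; the function name `mismatch_positions` is only illustrative.

```python
def mismatch_positions(query, subject):
    """Return 1-based positions where two equal-length, ungapped sequences differ."""
    if len(query) != len(subject):
        raise ValueError("sequences must have the same length")
    return [i for i, (q, s) in enumerate(zip(query, subject), start=1) if q != s]


# Query and Sbjct rows copied from the reference 2 BLAST alignment (two rows joined).
query = ("ACGGATAAAGTTGTTGCACTCGAGCTAGGAGCAAGTGATTTTATCGCTAAGCCGTTTAGT"
         "ACGAGAGAGTTTCTTGCACGCATTCGGGTTGCCTTGCGCG")
subject = ("ACGGATAAAGTTGTTGCACTCGAGCTAGGAGCAAGTGATTTTATCGCTAAGCCGTTCAGT"
           "ATCAGAGAGTTTCTAGCACGCATTCGGGTTGCCTTGCGCG")

positions = mismatch_positions(query, subject)
identity = len(query) - len(positions)
print(f"{len(positions)} mismatches at query positions {positions}")
print(f"identities = {identity}/{len(query)}")
```

For the rows as pasted this gives 4 mismatches and 96/100 identities, in agreement with the BLAST summary line.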
Thanks for your help
Mark
Are the blast alignments against two different original references (which appear to be of different length based on info above)? Are you selecting only one top alignment? Are there other alignments reported?
The blast alignments are against two different blast dbs.
For reference 1 I am selecting the top hit (so it seems correct that bowtie2 produces an XS tag for the reference 1 alignment). There is only 1 sequence in the blast db for reference 2.
The query sequences in the BLAST output are not the same as in the SAM file. Did I miss something?
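A quick way to check this, allowing for strand, is to compare the BLAST query against each SAM SEQ field and its reverse complement. This is a minimal Python sketch using only the sequences pasted in this thread; the variable and function names are illustrative.

```python
def reverse_complement(seq):
    """Reverse-complement a DNA sequence (A/C/G/T/N only)."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


# SEQ fields copied from the two SAM records in the question.
sam_seq_ref1 = ("CTCGACTGGCAATGAGAAGTTGCTCGCGCGATAGAACGTCGCGGGGTTTC"
                "TCTAAAAACGCGAGGAGAAGATTGAACTCACCTGCCGTAAGTTTCACCTC")
sam_seq_ref2 = ("GAGGTGAAACTTACGGCAGGTGAGTTCAATCTTCTCCTCGCGTTTTTAGA"
                "GAAACCCCGCGACGTTCTATCGCGCGAGCAACTTCTCATTGCCAGTCGAG")

# Query rows copied from the BLAST output (two rows joined).
blast_query = ("ACGGATAAAGTTGTTGCACTCGAGCTAGGAGCAAGTGATTTTATCGCTAAGCCGTTTAGT"
               "ACGAGAGAGTTTCTTGCACGCATTCGGGTTGCCTTGCGCG")

for name, sam_seq in (("reference 1", sam_seq_ref1), ("reference 2", sam_seq_ref2)):
    same_strand = blast_query == sam_seq
    opposite_strand = blast_query == reverse_complement(sam_seq)
    print(f"{name}: identical={same_strand}, reverse complement={opposite_strand}")

# The two SAM records should carry the same read on opposite strands.
print("SAM SEQs are reverse complements of each other:",
      reverse_complement(sam_seq_ref1) == sam_seq_ref2)
```

With the sequences as pasted, the two SAM SEQ fields are reverse complements of each other, but the BLAST query matches neither of them on either strand.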