9.8 years ago
AW
I would be very grateful if someone could answer my question about SAM files. I mapped paired-end reads with Bowtie 2, requiring both mates to map concordantly.
However, when I look at the SAM output file, even though every read is tagged as concordantly mapped (YT:Z:CP), sometimes the alignments for both mates are reported and sometimes only one mate is reported. This is illustrated in the example below, where only one mate's alignment is shown for HISEQ2500-09:128:H9FFTADXX:2:1102:13154:48635 but both are reported for HISEQ2500-09:92:H8PJKADXX:1:1214:1438:92949.
What is causing this?
Thanks.
grep "scaffold100060" output.sam
HISEQ2500-09:92:H8PJKADXX:1:1214:1438:92949 83 scaffold100060 208 42 100M = 19 -289 GGATTTTAAAGCCACTCTAAGTCACTTTTTCTGGCATAAAAAACTCCAACAAATAACTGGTCAAGAAATTTGTAATCACTTTTATAAATTAGTCCAACAG DDEDDDDDDDDDEEEEEDFFFFEHHHHIIIJJJIIJJIJJJIHJJJJIIJJJJIJIJJJJJJJJJJJHHCJJIIHHFEJJJJJJIJJHHHHHFFFFFCCC AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:100 YS:i:0 YT:Z:CP
HISEQ2500-09:92:H8PJKADXX:1:1214:1438:92949 163 scaffold100060 19 42 100M = 208 289 AATTAATCTGCTTTGGACTGAAAAGAACTTCAGTCAGCATAATGCGGCTGGATGCAACATAATTTCCAGATTTAAAGTATCTACTAAAGTTTTAACAATC BBBFFFFFHHHHHJJJJIIJJJJJIGJJJJIHJJJJJJJJJJJJIJJJJJJGIIJJJJIJJJIJJHHHHGHHFFFFFFFFEEECEDEDDEDEDDDDDDDD AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:100 YS:i:0 YT:Z:CP
HISEQ2500-09:128:H9FFTADXX:2:1102:13154:48635 99 scaffold100060 60 40 100M = 260 306 ATGCGGCTGGATGCAACATAATTTCCAGATTTAAAGTATCTACTAAAGTTTTAACAATCCCATGTAAAGCACCTAATTTACTGAATTGTAAATTAATTGT ??@DD:?D?<#22ABFFGFFEBFGIEFEGFFIIFEG:BGFIEFECFCF?B@FECFGCFEIFIFI@DFIBEEFEDDDDDDDAAAABBB>@DDB@BB@>>@; AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:100 YS:i:-34 YT:Z:CP
HISEQ2500-09:128:H9FFTADXX:2:2212:6463:28248 99 scaffold100060 55 23 100M = 350 387 GCATAATGCGGCTGGATGCAACATAATTTCCAGATTTAAAGTATCTACTAAAGTTTTAACAATCCCATGTAAAGCACCTAATTTACTGAATTGTAAATTA CCCFFFFFHHHHHJJJJJJJJJJJJJJJJJJJJJJJJJJJJGHJJJJJJJJJJJJJJJJJJJIJJJJIJHHHHHHHFFFFFEEEDEEEDDDEDDEEFFED AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:100 YS:i:-55 YT:Z:CP
Do you have more than 4 alignments output by grep? You're correct that the mates should all be there in each of those cases. If they're not, then that's a bug in the aligner.
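To see which mate each reported record represents, it can help to decode the FLAG column (83, 163 and 99 in the output above). The following is a minimal sketch based on the FLAG bit definitions in the SAM specification; the helper name `decode_flag` and the label strings are made up for illustration and are not part of any tool.

```python
# SAM FLAG bits as defined in the SAM specification.
# The label strings are illustrative names, not official identifiers.
FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the labels of all bits set in a SAM FLAG value."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


# The three FLAG values that appear in the grep output above.
for flag in (83, 163, 99):
    print(flag, decode_flag(flag))
```

Decoded this way, 83 and 163 are the first and second mate of one pair, while both records with flag 99 are first mates on the forward strand whose second mate is expected on the reverse strand of the same scaffold.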
Hi,
Thanks for your help! I had a few more alignments output by grep but they just showed the same pattern. I'm using Bowtie 2 version 2.2.4. Have you come across this problem before?
Not that I've seen, but I honestly haven't explicitly checked. One of my programs uses bowtie2 internally, so I'll add a check for this. Can you post the exact command that you ran, just in case there's some odd combination of options needed to cause this behavior?
Make sure you are grepping for the read name and not the scaffold name. The alignments could still be there but just not reported consecutively.
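One way to check this across the whole file, rather than one read at a time, is to group records by read name and list the names for which only one mate has a primary record. This is a rough sketch, assuming a plain-text, tab-separated SAM file; the function names are hypothetical, and the demo records are abbreviated from the output above (SEQ and QUAL replaced with `*`, optional tags dropped).

```python
from collections import defaultdict


def mates_per_read(sam_lines):
    """Map each read name to the set of mates ('R1', 'R2') seen for it."""
    seen = defaultdict(set)
    for line in sam_lines:
        # Skip blank lines and header lines.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        qname, flag = fields[0], int(fields[1])
        # Ignore secondary (0x100) and supplementary (0x800) records.
        if flag & 0x900:
            continue
        if flag & 0x40:
            seen[qname].add("R1")
        if flag & 0x80:
            seen[qname].add("R2")
    return dict(seen)


def reads_missing_a_mate(sam_lines):
    """Return sorted read names that do not have both R1 and R2 records."""
    per_read = mates_per_read(sam_lines)
    return sorted(q for q, mates in per_read.items() if mates != {"R1", "R2"})


# Abbreviated records from the post (first nine columns kept, SEQ/QUAL as '*').
demo = [
    "HISEQ2500-09:92:H8PJKADXX:1:1214:1438:92949\t83\tscaffold100060\t208\t42\t100M\t=\t19\t-289\t*\t*",
    "HISEQ2500-09:92:H8PJKADXX:1:1214:1438:92949\t163\tscaffold100060\t19\t42\t100M\t=\t208\t289\t*\t*",
    "HISEQ2500-09:128:H9FFTADXX:2:1102:13154:48635\t99\tscaffold100060\t60\t40\t100M\t=\t260\t306\t*\t*",
]
print(reads_missing_a_mate(demo))
```

For a real file, the same function can be given an open file handle, for example `reads_missing_a_mate(open("output.sam"))`. If it returns an empty list for the full file, the mates are present and were simply missed by the grep.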