12.4 years ago
Bnfoguy
Hello Everyone,
I have been working with cancer exome data and I have two FASTQ files. I ran these files through Stampy for paired-end mapping and got a SAM file. Unfortunately, the Stampy script crashed after running for close to 10 hours. When I tried to convert the SAM file to BAM format, I got the following error:
CIGAR and Sequence length are inconsistent
Here are the offending lines:
ERR035488.960541 147 chr1 27205782 99 72M = 27205550 -304 TATGCCTCCTTGAGTGTCAGTGGCGTGATCTTGGCCCGGCTCACACCGGCCGGCAGGAAGTCTAGTAGGCAG 8<=21<?7A<<DA8>>@CEDE.D<DDB?@;BAC6;48EB@AE4>4@&FEB6FEFGFD1GFAC@DFAFBCBB> PQ:i:46 SM:i:96 UQ:i:0 MQ:i:96 XQ:i:15 NM:i:0
ERR035488.889173 147 chr1 24870728 99 72M = 24870687 -113 TCATTTTGTTTTGAGAAGTAGCAAAATTGTTTTTTCTACTAATTAGCCAGTTACTCTGAGATAAAAGTCACG <@<?@BCFFEGFHFHHIICIHEFHHGGFGGFGGGCGC?GCGFFCIGGHGHFC@G@GEGHGECFFEGE?D::? PQ:i:49 SM:i:96 UQ:i:0 MQ:i:96 XQ:i:34 NM:i:0
What can I do to solve this problem?
Best,
Bnfoguy
You say the script crashed, but still gave a SAM file? Did you look at the tail of the SAM file to see if it is truncated? If it is, you can get rid of the truncated line and convert what's left to BAM.
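A minimal Python sketch of that suggestion, assuming a record counts as truncated when it lacks a trailing newline or has fewer than the 11 mandatory SAM fields. The function names and the in-memory demo data are hypothetical and are not part of Stampy or samtools.

```python
import io

MANDATORY_FIELDS = 11  # QNAME through QUAL in the SAM specification


def is_complete_record(line):
    """Return True if a SAM line looks fully written."""
    if not line.endswith("\n"):
        return False  # the writer stopped in the middle of the line
    if line.startswith("@"):
        return True   # header line; alignment names cannot start with '@'
    return len(line.rstrip("\n").split("\t")) >= MANDATORY_FIELDS


def copy_complete_records(src, dst):
    """Stream src to dst, skipping incomplete lines; return how many were skipped."""
    dropped = 0
    for line in src:
        if is_complete_record(line):
            dst.write(line)
        else:
            dropped += 1
    return dropped


# In-memory demo with made-up records; the last one is cut off after the CIGAR.
demo_in = io.StringIO(
    "@HD\tVN:1.0\n"
    "r1\t0\tchr1\t1\t99\t4M\t*\t0\t0\tACGT\tIIII\n"
    "r2\t0\tchr1\t5\t99\t4M"
)
demo_out = io.StringIO()
n_dropped = copy_complete_records(demo_in, demo_out)
print(n_dropped, "incomplete line(s) skipped")
```

For a real file, pass two open file handles instead of the `StringIO` objects. If more than one line is skipped, the problem is probably not just a truncated tail.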
There seems to be no inconsistency with the CIGAR string and read length. I'd first check seidel's suggestion.
Base quality string looks incorrect for the first read (length inconsistency). Don't know if it's the reason why the aligner is crashing.
Ashwin, are you sure you accounted for the [HTML] in the quality?
This is a Biostar display error: the rendering protection thinks that some of the content in the quality string may be an HTML tag, so it is escaped as [HTML].
I think there should be a way to post plain text in posts without any HTML or other extra validation.
Either way the quality string looks incorrect.
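To run the length comparison from this thread on the raw file rather than on the rendered post, something like the following Python sketch could be used. It assumes the usual SAM rule that only the M, I, S, = and X operations consume read bases; the function names and test records are made up for illustration.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
QUERY_CONSUMING = set("MIS=X")  # operations that use up bases of the read


def cigar_query_length(cigar):
    """Number of read bases implied by a CIGAR string."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in QUERY_CONSUMING)


def length_problems(sam_line):
    """List the length mismatches found in one SAM alignment line."""
    fields = sam_line.rstrip("\n").split("\t")
    cigar, seq, qual = fields[5], fields[9], fields[10]
    problems = []
    if seq != "*":  # '*' means the field is not stored
        if cigar != "*" and cigar_query_length(cigar) != len(seq):
            problems.append("CIGAR/SEQ")
        if qual != "*" and len(qual) != len(seq):
            problems.append("SEQ/QUAL")
    return problems


# Made-up record whose quality string is one character too short.
bad = "r1\t0\tchr1\t1\t99\t4M\t*\t0\t0\tACGT\tIII"
print(length_problems(bad))
```

Running `length_problems` over every non-header line of the SAM file would show whether the reported reads are really inconsistent or only look that way in the post.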