Cuffquant Error:- BAM error: CIGAR op has zero length
1
0
Entering edit mode
10.2 years ago
komal.rathi ★ 4.1k

Hi everyone,

I am trying to run cuffquant using the mm10 genome. Here is my command:

cuffquant \
  -o ./cuffquant_out/heart_adultmale \
  -p 4 \
  --frag-bias-correct mm10.fa \
  --multi-read-correct \
  --library-type fr-firststrand \
  Mus_musculus.GRCm38.75.protein_linc.gtf \
  ./gsnap_alignment/heart_adultmale_filter_sort.bam

[09:40:24] Loading reference annotation and sequence.
[09:40:53] Inspecting maps and determining fragment length distributions.
BAM error: CIGAR op has zero length (D5N1JJN1:213:C4HW1ACXX:6:1313:14379:56769)
BAM error: CIGAR op has zero length (D5N1JJN1:213:C4HW1ACXX:6:1311:11078:8414)
BAM error: CIGAR op has zero length (D5N1JJN1:213:C4HW1ACXX:6:1306:6005:41627)
BAM error: CIGAR op has zero length (D5N1JJN1:213:C4HW1ACXX:6:1102:12593:36794)
BAM error: CIGAR op has zero length (D5N1JJN1:213:C4HW1ACXX:6:1105:20343:33163)
BAM error: CIGAR op has zero length (D5N1JJN1:213:C4HW1ACXX:6:1105:3726:86180)
BAM error: CIGAR op has zero length (D5N1JJN1:213:C4HW1ACXX:6:1108:10176:6681)
BAM error: CIGAR op has zero length (D5N1JJN1:213:C4HW1ACXX:6:1108:2918:39583)

My bam file (heart_adultmale_filter_sort.bam) is filtered using bamtools filter to check if "cigar" : "M" or "isProperPair" : "true", just like it is given here. The file is also sorted using samtools sort command. I don't know what is going on here. I used the same command for other data and it worked fine at that time.

Update: Thank you for the comments. Surprisingly, cuffquant is still running despite throwing out these errors, so I am letting it run. In the meanwhile I am looking at the alignments and trying to figure out what is wrong.

I tried to look at the alignments of the first read that throws out error:

samtools view heart_adultmale_filter_sort.bam | grep "D5N1JJN1:213:C4HW1ACXX:6:1313:14379:56769"

D5N1JJN1:213:C4HW1ACXX:6:1313:14379:56769    163    chr2    77314799    3    3S68M28H    =    77314867    2095    GGTCATGAGTTTGTTTTTCAGCATCCGTGGGCTGCTCACAATCACCAGTAACTAGTTCCAGGGAATCGGCC    CCFFDFFFHFHHIHIIJIIJJJJJJJIGIJJJJJJJJJJJIJJIIJJJHBFHJIJIIJIJJIIIJJHFHFB    RG:Z:heart_adultmale    MD:Z:68    NH:i:2    HI:i:1    NM:i:0    SM:i:3XQ:i:40    X2:i:40    XO:Z:CM    XS:A:+    CT:Z:2F3S68M28H0T1R14H12M1942N73M
D5N1JJN1:213:C4HW1ACXX:6:1313:14379:56769    419    chr2    77314799    3    3S80M0N2M14S    =    77316821    2098    GGTCATGAGTTTGTTTTTCAGCATCCGTGGGCTGCTCACAATCACCAGTAACTAGTTCCAGGGAATCGGCCCCCTCTTCTTCTCTCACTAGTAGAGAAA    CCFFDFFFHFHHIHIIJIIJJJJJJJIGIJJJJJJJJJJJIJJIIJJJHBFHJIJIIJIJJIIIJJHFHFBCBBDDDDDDDDDDDDDDDCCC:CDDDCD    RG:Z:heart_adultmale    MD:Z:82    NH:i:2    HI:i:2    NM:i:0    SM:i:3    XQ:i:40    X2:i:40    XO:Z:CM    XS:A:+
D5N1JJN1:213:C4HW1ACXX:6:1313:14379:56769    83    chr2    77314867    3    14H12M1942N73M    =    77314799    -2095    CCCTCTTCTTCTCTCACTAGTAGAGAAATCTTTTGAAGGACAGAAGAACTTCTGCTCGGAGGACCCGTTGAACTCTGAGTGTGGC    DDBCCCA@>=>FFFFDFEHEEHHHJIIIHGGJIHJJHIIIJIIIIIIGHHFIIHBFIJIGIHGJJJJJIIHGJHHHHHFFFFF@C    RG:Z:heart_adultmale    MD:Z:85    NH:i:2HI:i:1    NM:i:0    SM:i:3    XQ:i:40    X2:i:40    XO:Z:CM    XS:A:+
D5N1JJN1:213:C4HW1ACXX:6:1313:14379:56769    339    chr2    77316821    3    26S73M    =    77314799    -2098    CCAGGGAATCGGCCCCCTCTTCTTCTCTCACTAGTAGAGAAATCTTTTGAAGGACAGAAGAACTTCTGCTCGGAGGACCCGTTGAACTCTGAGTGTGGC    BCCC>CBB@B@DBDDDBCCCA@>=>FFFFDFEHEEHHHJIIIHGGJIHJJHIIIJIIIIIIGHHFIIHBFIJIGIHGJJJJJIIHGJHHHHHFFFFF@C    RG:Z:heart_adultmale    MD:Z:73    NH:i:2    HI:i:2    NM:i:0    SM:i:3    XQ:i:40    X2:i:40    XO:Z:CM    XS:A:+    XA:Z:73,0

Another read which throws the error:

samtools view heart_adultmale_filter_sort.bam | grep "D5N1JJN1:213:C4HW1ACXX:6:1311:11078:8414"

D5N1JJN1:213:C4HW1ACXX:6:1311:11078:8414    163    chr2    77314841    40    38M0N25M36H    =    77316857    2079    CAGTAACTAGTTCCAGGGAATCGGCCCCCTCTTCTTCTCTCACTAGTAGAGAAATCTTTTGAA    =8B=BBAH?F?FFDHEH>DHG@CAEGDF@DHGB?DHB?<??<?8?99BBB8FHAHCEHAFCAD    RG:Z:heart_adultmale    MD:Z:40G1AGG1G1A1C1C1GTGG1AT1C    NH:i:1    HI:i:1    NM:i:15SM:i:40    XQ:i:40    X2:i:0    XO:Z:CU    XS:A:+    CT:Z:2F38M0N25M36H1953T1R36H63M
D5N1JJN1:213:C4HW1ACXX:6:1311:11078:8414    83    chr2    77316857    40    36H63M    =    77314841    -2079    GGACAGAAGAACTTCTGCTCGGAGGACCCGTTGAACTCTGAGTGTGGCATGCCACTCTCTTGT    @EEA;EAC@77;@3AGG@@?3?98@00@CFD99C?CGACFG@EBGGEBIHE;AABHDB;AD@@    RG:Z:heart_adultmale    MD:Z:63    NH:i:1    HI:i:1    NM:i:0    SM:i:40    XQ:i:40    X2:i:0    XO:Z:CUXS:A:+

Some reads that don't throw the error (with MAPQ=0):

samtools view heart_adultmale_filter_sort.bam | head

D5N1JJN1:213:C4HW1ACXX:6:1105:8310:40979    419    chr1    3018434    0    56M43H    =    3018490    115    CTTTTTTTATTCCTTCCTTGACCAAGGTATCATTGAGAAGAGTGTTGTTCAGTTTC    B@FFFFFHHHHHJJIJJJJJIJIJJJJCFGHJJJJJJJJJIIDFHIJIGHGIJJJJ    RG:Z:heart_adultmale    MD:Z:4C51    NH:i:28    HI:i:15    NM:i:1    SM:i:0    XQ:i:40    X2:i:40    XO:Z:CM
D5N1JJN1:213:C4HW1ACXX:6:1210:9786:30088    419    chr1    3018456    0    56M43H    =    3018512    113    CAAGGTATCATTGAGAAGAGTGTTGTTCAGTTTCCACGTGAATGTTGGCTTTCTAT    @;ADDDDHDDHDFDGFGABB<AHHIGEHIHHHGHIIICHGGGGI?BE@DBBFCHDB    RG:Z:heart_adultmale    MD:Z:56    NH:i:31    HI:i:16    NM:i:0    SM:i:0    XQ:i:40    X2:i:40    XO:Z:CM    PG:Z:A
D5N1JJN1:213:C4HW1ACXX:6:1105:8310:40979    339    chr1    3018490    0    40H59M    =    3018434    -115    CACGTGAATGTTGGCTTTCTATTATTTATGTTGTTATTGAGGATCAGCCTTAGTCCATG    HDJJJJJJJJJJJJGHGEJJJJJJJJJJJJIJJJJJJIJJJJIIJIJGHGHHFFFFFCC    RG:Z:heart_adultmale    MD:Z:40A18    NH:i:28    HI:i:15    NM:i:1    SM:i:0    XQ:i:40    X2:i:40    XO:Z:CM
D5N1JJN1:213:C4HW1ACXX:6:1210:9786:30088    339    chr1    3018512    0    42H57M    =    3018456    -113    TATTTATGTTGTTATTGAAGATCAGCCTTAGTCCATGGTGATCTGATAGGATGCATG    GGGGHGJIHGDF@EHGDFDGGBHCGHBHEIHCGHEHGHEGEBFCDFDDFDDDDDF@C    RG:Z:heart_adultmale    MD:Z:57    NH:i:31    HI:i:16    NM:i:0    SM:i:0    XQ:i:40    X2:i:40    XO:Z:CM    PG:Z:A
D5N1JJN1:213:C4HW1ACXX:6:1208:10624:67054    163    chr1    3019723    0    66M33H    =    3019789    133    GGGCGTCTCTTTCTTTAGATCTGGGAAGTTTTCTTCAATAATTTTGTTGAAGATGTTTGCTGGTCC    CCFFFFFHHHHHJJJJJJIJJJJJJJJJIIJJJJJJIJJJJIJJJJJJJHIJJJJJJJJJJJJJJJ    RG:Z:heart_adultmale    MD:Z:4A15T45    NH:i:71    HI:i:1    NM:i:2    SM:i:0    XQ:i:40    X2:i:40    XO:Z:CMPG:Z:A    CT:Z:2F66M33H0T1R32H67M
D5N1JJN1:213:C4HW1ACXX:6:1208:10624:67054    83    chr1    3019789    0    32H67M    =    3019723    -133    TTTGAGTTGAAAATCTTCATTCTCATCCACTCCTATTATCTGTAGGTTTGGTCTTCTCATTGTGTCC    IIIJJJJIJJJIJIJJJIIIIIHIJJIGDJJJJJJJJIIIJJJJJJJJJJJJJJJHHHHHFFFFFCC    RG:Z:heart_adultmale    MD:Z:67    NH:i:71    HI:i:1    NM:i:0    SM:i:0    XQ:i:40    X2:i:40    XO:Z:CM    PG:Z:A

I don't think the MAPQ should be causing any problems.

mm10 bam ensembl75 gsnap cuffquant • 4.2k views
ADD COMMENT
1
Entering edit mode

this sounds like a problem in the BAM file, find the read that is reported and post the CIGAR string of it

ADD REPLY
0
Entering edit mode

how does your bam file looks like (samtools view)

ADD REPLY
0
Entering edit mode

Most probably the problem has to do something with the way GSNAP stores spliced alignments. Please paste all the records (alignments) for some of these reads.

ADD REPLY
0
Entering edit mode

Check if these are actually unmapped, or poorly mapped reads. If that's the case, just cut them out of the BAM altogether for the purposes of this analysis step.

ADD REPLY
1
Entering edit mode
10.2 years ago

My guess: I think the problem has to do with hard-clipped reads. If a read is hard-clipped the corresponding sequence should not appear in the "SEQ" column of the SAM file but Cuffquant doesn't know that. So it is throwing the error. Additionally, Cuffquant only inspects primary alignments with MAPQ >0 for the CIGAR inconsistency (my assumption), the error will only come from such reads or alignments.

Cuffquant simply ignores the secondary alignments and alignments with MAPQ of 0 and won't check them for the CIGAR inconsistency. This is the reason that reads with hard-clipping but MAPQ of 0 don't throw any error.

ADD COMMENT
0
Entering edit mode

But there are reads that are hard-clipped and do not cause any error like D5N1JJN1:213:C4HW1ACXX:6:1105:8310:40979 and D5N1JJN1:213:C4HW1ACXX:6:1210:9786:30088

ADD REPLY
0
Entering edit mode

But they have MAPQ of zero plus these are secondary alignments. Seems you just skimmed through my answer :-)

ADD REPLY
0
Entering edit mode

Oh sorry! I missed the second part of your answer. I just got really curious reading the first part and started looking for a pattern in the CIGAR strings. Thanks for pointing that out.

ADD REPLY

Login before adding your answer.

Traffic: 1952 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6