9.4 years ago
Dejian
When I run htseq-count on BAM files generated by STAR, I repeatedly get the same error message (see the examples below). I extracted the corresponding records from the BAM files and found that they all contain soft clipping. I thought htseq-count handled soft clipping correctly (http://www-huber.embl.de/users/anders/HTSeq/doc/alignments.html#cigar-strings). Has anybody else encountered this problem, and if so, how did you solve it?
EXAMPLE 1:
Error occured when processing SAM input (record #66220 in file ../SRR1974799.sorted.dedup.bam):
unsigned byte integer is less than minimum
[Exception type: OverflowError, raised in csamtools.pyx:2308]
samtools view ../SRR1974799.sorted.dedup.bam | sed -n '66220p'
SRR1974799.1020660.1 147 chr1 1549493 255 66M9S = 1549429 -130 TGAACAGCAGGTACTCAATCATGAAGAGCTAAGCCTGATTTCATCACGACAGCTGTGAAAGTTGCACCCATGTAC <FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF<FFFFFFFFFFFFFFAAAAA RG:Z:SRR1974799 NH:i:1 HI:i:1 jI:B:i,-1 jM:B:c,-1 nM:i:0 AS:i:139
EXAMPLE 2:
Error occured when processing SAM input (record #174801 in file ../SRR1974808.sorted.dedup.bam):
unsigned byte integer is less than minimum
[Exception type: OverflowError, raised in csamtools.pyx:2308]
samtools view ../SRR1974808.sorted.dedup.bam | sed -n '174801p'
SRR1974808.1497057.1 83 chr1 40149760 255 67M8S = 40148296 -1531 CCGTTCTTGTCGAAGGTGCGGAAAGCGTGCTGCGCGAACTTGGAGGCGTCGCCGTAGGGGAAGAACTTGATGTAG FFFAAFFFFFFFFF7F.FFFFFF7FFF)FFFFFFF<FFF<7FFFFFFF<FFFFAF<FFFFFFFAFFFFFFAA<AA PG:Z:MarkDuplicates RG:Z:SRR1974808 NH:i:1 HI:i:1 jI:B:i,-1 jM:B:c,-1 nM:i:0 AS:i:139
EXAMPLE 3:
Error occured when processing SAM input (record #77098 in file ../SRR1974802.sorted.dedup.bam):
unsigned byte integer is less than minimum
[Exception type: OverflowError, raised in csamtools.pyx:2308]
samtools view ../SRR1974802.sorted.dedup.bam | sed -n '77098p'
SRR1974802.1214351.1 99 chr1 16045055 255 13S62M = 16046228 1221 GAGTACATGGGAAGATCACCTGACGCTCTTCCTGACATTGGTGTCCGGGCTAGAGTTCATTCGTTCCGAGCTGGA A)AAA)AFA.FF)FFF<7.)FFF.F<FFFF..F..F)FA.)F<7FA<F))F<FFFAFF.FFF<F)FA.<FFF7FF PG:Z:MarkDuplicates RG:Z:SRR1974802 NH:i:1 HI:i:1 jI:B:i,-1 jM:B:c,-1 nM:i:2 AS:i:103
EXAMPLE 4:
Error occured when processing SAM input (record #153985 in file ../SRR1974806.sorted.dedup.bam):
unsigned byte integer is less than minimum
[Exception type: OverflowError, raised in csamtools.pyx:2308]
samtools view ../SRR1974806.sorted.dedup.bam | sed -n '153985p'
SRR1974806.735761.1 99 chr1 45469184 255 68M7S = 45469375 380 TGTCAGTGTCGATGGCCACGCAGTTGTAGGCCGCATAGCGGAGCTTCTCCTCGCATACCTTGGCACTGGCATAGT <<AAAFFFFFFFFFFF<)FFFFF<FAFFFAFAFFFFFFFFFFFFAAFF.FAF<<F7<AFFFFFF.<FFFFA7FFA PG:Z:MarkDuplicates RG:Z:SRR1974806 NH:i:1 HI:i:1 jI:B:i,-1 jM:B:c,-1 nM:i:0 AS:i:141 XS:A:-
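As a quick sanity check on the four records above, here is a minimal sketch that parses their CIGAR strings and reports how many read bases each one covers and how many of those are soft-clipped. The helper names (`parse_cigar`, `query_length`, `soft_clipped`) are made up for illustration and are not part of HTSeq or pysam; the only assumption is the standard SAM rule that the M, I, S, = and X operations consume read bases.

```python
import re

# One CIGAR operation: a length followed by a SAM operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Whole-string pattern used to reject malformed CIGAR strings.
CIGAR_FULL = re.compile(r"(?:\d+[MIDNSHP=X])+")
# Operations that consume bases of the read (query) per the SAM spec.
QUERY_CONSUMING = set("MIS=X")


def parse_cigar(cigar):
    """Split a CIGAR string into a list of (length, operation) tuples."""
    if not CIGAR_FULL.fullmatch(cigar):
        raise ValueError(f"malformed CIGAR string: {cigar!r}")
    return [(int(length), op) for length, op in CIGAR_OP.findall(cigar)]


def query_length(cigar):
    """Number of read bases implied by the CIGAR string."""
    return sum(n for n, op in parse_cigar(cigar) if op in QUERY_CONSUMING)


def soft_clipped(cigar):
    """Number of soft-clipped read bases in the CIGAR string."""
    return sum(n for n, op in parse_cigar(cigar) if op == "S")


# CIGAR strings copied from EXAMPLE 1 to EXAMPLE 4 above.
for cigar in ("66M9S", "67M8S", "13S62M", "68M7S"):
    print(cigar, query_length(cigar), soft_clipped(cigar))
```

If the read length implied by each CIGAR string matches the length of the SEQ field, the soft clipping itself is well-formed, which would suggest looking elsewhere in the record for the cause of the error.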
This is actually a pysam error that I've seen a few others run into (though with different programs). What version of pysam do you have installed and can you try upgrading it?
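To answer the version question, a minimal sketch like the following prints the installed pysam and HTSeq versions. It assumes both packages expose a `__version__` attribute and falls back to "unknown" if they do not; `installed_version` is a hypothetical helper name, not part of either library.

```python
import importlib


def installed_version(module_name):
    """Return the module's __version__, "unknown" if it has none,
    or None if the module cannot be imported at all."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", "unknown")


# Report the versions of the two packages involved in the error.
for name in ("pysam", "HTSeq"):
    print(name, installed_version(name))
```

If pysam turns out to be old, upgrading it in the same Python environment that htseq-count uses (for example with `pip install --upgrade pysam`, assuming a pip-managed install) and rerunning the count is the simplest thing to try.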