8.8 years ago
Kssr
I'm running HTSeq to get counts from a SAM file using the steps below:
samtools sort -n tmp.bam nameSrt
samtools fixmate nameSrt.bam nameSrt.fixmate.bam
samtools view -h nameSrt.fixmate.bam >nameSrt.fixmate.sam
python -m HTSeq.scripts.count -f sam -r name -s $ss_library -a 10 -t exon -i $feature -m union nameSrt.fixmate.sam tmp.gff > tmp.HTSeq.counts
However, I get the below error:
Error occured when processing SAM input (line 77320836 of file nameSrt.fixmate.sam):
("Malformed SAM line: MRNM == '*' although flag bit &0x0008 cleared", 'line 77320836 of file nameSrt.fixmate.sam')
[Exception type: ValueError, raised in _HTSeq.pyx:1323]
Line 77320836 is the last line in the file, and its MRNM field is not set to '*'. Here are the last four lines of the SAM file:
HISEQ:512:C8CNYACXX:6:2316:21397:43838 141 * 0 0 * * 0 0 GCAGAGAGCCTACCTGGATTGCACGTGCGTGGAGTGGTTCTGCAGATACCTGGAGAATGGGAAGGTGAAGTTGCNACGCACGGAAGCACCCAAGGGAAAT ==<AAA7A;>AA+77@BA7+3+,2<<?1?AA61?:>A080*=?==*)79)/=A(=7=7)=>))5;))).).665#(,((,,3;',)8(((+((((((+(( RG:Z:160108_SN172_0512_BC8CNYACXX_CTTGTA_L006
HISEQ:512:C8CNYACXX:6:2316:21397:70573 409 12 125396664 3 100M * 0 0 NCGGCTCCACTTCGAGAGTGATGGTNTTACCAGTCAGGGTCTTCACGAAGATCTGCATCCCACCTCTAAGACGGAGCACCAGGTGCAGGGTGGACTCTTT <<8+(:C><895@:>@@;>>;,,,(#;B>FFDDCA;FIGEHCC?IGGHGHCCFD;IGBGGDDHHGFCIFGIGHHHFHFFFE9IGGG@HFHFDDFFDD@@B CC:Z:= MD:Z:0T24C74 XG:i:0 NH:i:2 HI:i:0 NM:i:2 XM:i:2 XN:i:0 XO:i:0 CP:i:125396892 AS:i:-2 XS:A:- YT:Z:UU RG:Z:160108_SN172_0512_BC8CNYACXX_CTTGTA_L006
HISEQ:512:C8CNYACXX:6:2316:21397:70573 101 12 125396892 0 * = 125396892 0 NGGATGCCTTCCTTGTCTTGGATCTTTGCCTTGACATTCTCAATGGTGTCACTCGGCTCCACTTCGAGAGTGATGGTCTNNANAGTCAGGGTCTTCACGN #1+=ADDDHHDDFHIAHBHHIFHDHI,AAEHIIGFEEG?D:C@?GGBGH?FEF9DF(?<B838C@HFG(;DE?ECEH7;##(#(,5;=CCB?C?@>>@B1 RG:Z:160108_SN172_0512_BC8CNYACXX_CTTGTA_L006 MQ:i:3
HISEQ:512:C8CNYACXX:6:2316:21397:70573 1177 12 125396892 3 100M = 125396892 0 NCGGCTCCACTTCGAGAGTGATGGTNTTACCAGTCAGGGTCTTCACGAAGATCTGCATCCCACCTCTAAGACGGAGCACCAGGTGCAGGGTGGACTCTTT <<8+(:C><895@:>@@;>>;,,,(#;B>FFDDCA;FIGEHCC?IGGHGHCCFD;IGBGGDDHHGFCIFGIGHHHFHFFFE9IGGG@HFHFDDFFDD@@B MD:Z:0T24C74 XG:i:0 NH:i:2 HI:i:1 NM:i:2 XM:i:2 XN:i:0 XO:i:0 AS:i:-2 XS:A:- YT:Z:UU RG:Z:160108_SN172_0512_BC8CNYACXX_CTTGTA_L006
I'm trying to understand the SAM format better, but I still can't figure out a solution in this case. I'd appreciate any help solving this.
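For reference, the consistency rule named in the error message can be checked directly. The Python below is a minimal sketch, not HTSeq's own code: it reports any record whose mate reference field (MRNM/RNEXT, column 7) is '*' while flag bit 0x8 (mate unmapped) is clear. The function names are hypothetical, and restricting the check to paired reads (bit 0x1 set) is an assumption about how the check is applied.

```python
# SAM flag bits as defined in the SAM specification.
PAIRED = 0x1          # read is paired in sequencing
MATE_UNMAPPED = 0x8   # mate is unmapped


def mate_field_conflict(flag, rnext):
    """Return True if RNEXT/MRNM is '*' on a paired read whose 0x8 bit is clear.

    Assumption: the check only applies when the paired bit (0x1) is set.
    """
    return bool(flag & PAIRED) and not (flag & MATE_UNMAPPED) and rnext == "*"


def scan(lines):
    """Yield (line_number, read_name, flag) for each conflicting record.

    Line numbers are 1-based and count header lines too, so they can be
    compared with the line number reported in the error message.
    """
    for number, line in enumerate(lines, start=1):
        if line.startswith("@"):
            continue  # skip SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 7:
            continue  # not a complete alignment record
        flag, rnext = int(fields[1]), fields[6]
        if mate_field_conflict(flag, rnext):
            yield number, fields[0], flag


# (FLAG, RNEXT) pairs copied from the four records quoted above.
records = [(141, "*"), (409, "*"), (101, "="), (1177, "=")]
for flag, rnext in records:
    print(flag, rnext, mate_field_conflict(flag, rnext))
```

None of the four quoted records is reported as a conflict, which matches the observation that the last line does not have MRNM set to '*' with bit 0x8 clear. To check a whole file, `scan` can be given an open file handle, for example `for hit in scan(open("nameSrt.fixmate.sam")): print(hit)`.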
Check previous posts to see if there is any solution given already: