2.3 years ago
sanatbhadsavle
Hi, I am trying to count miRNA reads from a set of SAM files using htseq-count. I downloaded the mouse miRNA GFF3 file from miRBase (mmu.gff3). This is the command I use:
htseq-count L23.unaligned.sam mmu.gff3 -t miRNA_primary_transcript
I get this error:
> Error processing GFF file (line 14 of file mmu.gff3):
Feature MI0021869 does not contain a 'gene_id' attribute
[Exception type: ValueError, raised in features.py:387]
Has anyone else faced this? Is this a problem with the miRBase GFF3 file? Thanks.
Based on the documentation, it looks like htseq-count defaults to using the GFF gene_id attribute, but if you look at the content of mmu.gff3, none of the features contain a gene_id attribute. I assume this is intentional, based on the standards miRBase sets for annotating miRNAs.
I suggest first determining which feature annotations from mmu.gff3 to use for counting read overlap with miRNAs, i.e. do you want to count overlaps with mature sequences or primary transcripts; then selecting those from your GFF file; then running htseq-count with -i <id attribute>, where the attribute would be ID, Alias, or Name, based on what's defined in mmu.gff3.
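To check which attributes are actually available before picking a value for -i, something like the following rough Python sketch may help. It only lists the attribute keys found for each feature type in a GFF3 file. The function name attribute_keys_by_type and the sample lines are made up for illustration (the coordinates and names are not real miRBase records, apart from the MI0021869 ID taken from the error message), and it assumes the usual nine tab-separated GFF3 columns with key=value pairs separated by semicolons.

```python
from collections import defaultdict


def attribute_keys_by_type(gff_lines):
    """Return {feature_type: sorted list of attribute keys} for GFF3 lines."""
    keys = defaultdict(set)
    for line in gff_lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment/directive lines such as "##gff-version 3".
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # A GFF3 feature line has exactly nine tab-separated columns.
        if len(fields) != 9:
            continue
        feature_type, attributes = fields[2], fields[8]
        for pair in attributes.split(";"):
            if "=" in pair:
                key, _value = pair.split("=", 1)
                keys[feature_type].add(key.strip())
    return {ftype: sorted(found) for ftype, found in keys.items()}


# Illustrative lines only: invented coordinates and names, not real records.
SAMPLE = [
    "##gff-version 3",
    "chr1\t.\tmiRNA_primary_transcript\t100\t200\t.\t+\t.\t"
    "ID=MI0021869;Alias=MI0021869;Name=example-mir",
    "chr1\t.\tmiRNA\t110\t131\t.\t+\t.\t"
    "ID=MIMAT_example;Alias=MIMAT_example;Name=example-mir-5p;"
    "Derives_from=MI0021869",
]

if __name__ == "__main__":
    # For a real file, pass an open handle instead of SAMPLE:
    #     with open("mmu.gff3") as handle:
    #         summary = attribute_keys_by_type(handle)
    for ftype, found in attribute_keys_by_type(SAMPLE).items():
        print(ftype, found)
```

If the listing shows ID for the feature type you care about and no gene_id, the original command would presumably become something like: htseq-count -t miRNA_primary_transcript -i ID L23.unaligned.sam mmu.gff3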