htseq-count counts all reads as no_feature
9.4 years ago
manekineko ▴ 150

Hi,

I ran htseq-count like this:

htseq-count sorted.bam my.gff -i ID

and my GFF entries are in this format:

chr1      .       .       58812138        58812884        .       +       .       ID="something";

but in the end all reads are counted as no_feature. What am I missing?

htseq-count • 2.5k views
9.4 years ago

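You need a feature type in the 3rd column. By default htseq-count only counts features whose type is exon, so lines with "." in that column are ignored and every read ends up as no_feature. Usually the type is something like transcript or exon, but you can name it whatever you want and pass that name with the -t flag.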

So let's say you have:

chr1      .       myFeature       58812138        58812884        .       +       .       ID="something";

Run it with:

htseq-count sorted.bam my.gff -i ID -t myFeature
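
If the GFF has many lines with an empty 3rd column, the feature type can be filled in with a small script instead of by hand. This is only a minimal sketch, assuming a tab-separated file with nine columns; the function name `set_feature_type` and the label `myFeature` are illustrative and not part of htseq-count.

```python
def set_feature_type(gff_lines, feature_type="myFeature"):
    """Return GFF lines with column 3 (the feature type) set to feature_type.

    Comment lines (starting with '#') and blank lines pass through unchanged.
    Assumes tab-separated columns (an assumption about the input file).
    """
    out = []
    for line in gff_lines:
        stripped = line.rstrip("\n")
        if not stripped or stripped.startswith("#"):
            out.append(stripped)
            continue
        fields = stripped.split("\t")
        if len(fields) < 9:
            raise ValueError(
                f"expected 9 tab-separated columns, got {len(fields)}: {stripped!r}"
            )
        fields[2] = feature_type  # column 3 is index 2
        out.append("\t".join(fields))
    return out


if __name__ == "__main__":
    example = ['chr1\t.\t.\t58812138\t58812884\t.\t+\t.\tID="something";']
    for fixed in set_feature_type(example):
        print(fixed)
```

The rewritten lines can then be saved to a new GFF file and passed to htseq-count with -t myFeature as above.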

Should the 3rd column be unique per line, or can it be the same on every line as long as the ID is unique? And should I keep the -i flag and add the -t flag as well?


The 3rd column should be the same on every line. It's just a feature type that tells htseq-count which features to count. The ID should be unique. You can use -i and -t together.
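
To double-check a file against that advice before running htseq-count, something like the following could list the feature types present and any IDs that occur more than once. It is a rough sketch, assuming tab-separated columns and attributes written like ID="something"; or ID=something; the helper name `summarize_gff` is made up for illustration.

```python
import re
from collections import Counter


def summarize_gff(gff_lines, id_attr="ID"):
    """Return (feature type counts, sorted list of IDs seen more than once)."""
    types = Counter()
    ids = Counter()
    # Matches e.g. ID="something"; or ID=something; in the attribute column.
    pattern = re.compile(
        r'(?:^|;)\s*' + re.escape(id_attr) + r'[= ]\s*"?([^";]+)"?'
    )
    for line in gff_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        if len(fields) < 9:
            continue  # not a complete GFF record
        types[fields[2]] += 1
        match = pattern.search(fields[8])
        if match:
            ids[match.group(1)] += 1
    duplicates = sorted(i for i, n in ids.items() if n > 1)
    return types, duplicates
```

If the first result contains a single feature type and the second is empty, the file matches the layout described above.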


Do you know why it works with a 2 GB SAM file, but with a larger 16 GB SAM file it gives me this error:

53 GFF lines processed.
Error occured when reading beginning of SAM/BAM file.
[Exception type: StopIteration, raised in count.py:84]
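
StopIteration is what Python raises when an iterator has nothing left to yield, so one thing worth checking is whether the larger SAM file actually contains alignment records after its header (for example, it may be truncated or still being written). That is an assumption about the cause, not a confirmed diagnosis. A minimal sketch for a plain-text SAM file, with a hypothetical helper name:

```python
def first_alignment_line(sam_path):
    """Return the first non-header line of a SAM text file, or None if absent."""
    with open(sam_path, "r") as handle:
        for line in handle:
            if line.startswith("@"):  # SAM header lines start with '@'
                continue
            if line.strip():
                return line.rstrip("\n")
    return None
```

A result of None would mean the file has a header but no reads, in which case the problem lies upstream of htseq-count.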