Question

htseq-count counts all reads as no_feature

10.2 years ago

manekineko ▴ 150

Hi,

I have run htseq like this:

htseq-count sorted.bam my.gff -i ID

and my GFF entries are in this format:

chr1      .       .       58812138        58812884        .       +       .       ID="something";

but in the end all reads are counted as no_feature. Am I missing something?

ADD COMMENT • link updated 2.7 years ago by Ram 45k • written 10.2 years ago by manekineko ▴ 150

Ram · Answer 1 · 2015-06-27

10.2 years ago

Damian Kao 16k

You need a feature type in the 3rd column. Usually it's transcript, exon, or something like that. Name it whatever you want and pass that name with the -t flag: htseq-count only counts features whose type matches -t, which defaults to exon.

So let's say you have:

chr1      .       myFeature       58812138        58812884        .       +       .       ID="something";

Run it with:

htseq-count sorted.bam my.gff -i ID -t myFeature
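
If the GFF has many lines with "." in the 3rd column, the column can be filled in with a short script. The following is a minimal sketch, not part of htseq-count: the function name set_feature_type is hypothetical, and it assumes the file is tab-separated with the usual 9 GFF columns.

```python
import io


def set_feature_type(lines, feature_type="myFeature"):
    """Yield GFF lines with column 3 replaced by feature_type.

    Hypothetical helper for illustration; assumes tab-separated,
    9-column GFF records.
    """
    for line in lines:
        stripped = line.rstrip("\n")
        # Pass blank lines, comments and directives through unchanged.
        if not stripped or stripped.startswith("#"):
            yield stripped
            continue
        fields = stripped.split("\t")
        if len(fields) != 9:
            raise ValueError(
                f"expected 9 tab-separated columns, got {len(fields)}: {stripped!r}"
            )
        fields[2] = feature_type  # column 3 is the feature type
        yield "\t".join(fields)


# The entry from the question, written with real tab characters.
example = 'chr1\t.\t.\t58812138\t58812884\t.\t+\t.\tID="something";\n'
fixed = list(set_feature_type(io.StringIO(example)))
print(fixed[0])
```

For a real file, the same function could be applied to an open file handle and the output written to a new GFF, which is then passed to htseq-count together with -t myFeature.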

ADD COMMENT • link updated 2.7 years ago by Ram 45k • written 10.2 years ago by Damian Kao 16k

Should the 3rd column be unique, or can it be the same on every line as long as the ID is unique? And should I keep the -i flag and add the -t flag as well?

ADD REPLY • link updated 2.7 years ago by Ram 45k • written 10.2 years ago by manekineko ▴ 150

The 3rd column should be the same on every line. It's just a feature type, so that htseq-count knows which features to count. The ID should be unique. You can use -i and -t together.
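
Both conditions can be checked before running htseq-count. This is a rough sketch under assumptions: summarize_gff is a hypothetical helper, the file is tab-separated, and the attribute is written as in the question (ID="something";).

```python
import re
from collections import Counter


def summarize_gff(lines, id_attr="ID"):
    """Return (feature type counts, sorted list of IDs seen more than once).

    Hypothetical helper for illustration; lines without 9 tab-separated
    columns are skipped.
    """
    types = Counter()
    ids = Counter()
    # Accepts ID=something, ID="something" and ID "something".
    pattern = re.compile(r'(?:^|;)\s*' + re.escape(id_attr) + r'[ =]"?([^";]+)"?')
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 9:
            continue
        types[fields[2]] += 1
        match = pattern.search(fields[8])
        if match:
            ids[match.group(1)] += 1
    duplicates = sorted(name for name, n in ids.items() if n > 1)
    return types, duplicates


sample = [
    'chr1\t.\tmyFeature\t100\t200\t.\t+\t.\tID="a";',
    'chr1\t.\tmyFeature\t300\t400\t.\t+\t.\tID="b";',
    'chr1\t.\tmyFeature\t500\t600\t.\t+\t.\tID="a";',
]
types, duplicates = summarize_gff(sample)
print(dict(types))   # feature types found in column 3
print(duplicates)    # IDs that occur on more than one line
```

A single key in the first result means one -t value covers every line; any ID listed in the second result appears on more than one line.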

ADD REPLY • link updated 2.7 years ago by Ram 45k • written 10.2 years ago by Damian Kao 16k

Do you know why it works with a 2 GB SAM file, but with a larger 16 GB SAM file it gives me this error:

53 GFF lines processed.
Error occured when reading beginning of SAM/BAM file.
[Exception type: StopIteration, raised in count.py:84]

ADD REPLY • link updated 2.7 years ago by Ram 45k • written 10.2 years ago by manekineko ▴ 150
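
One possible reading of the StopIteration, offered as an assumption and not a confirmed diagnosis, is that no alignment record could be read from the start of the larger file, for example because it is empty, truncated, or contains only header lines. A minimal sketch for checking a text SAM file, where peek_sam is a hypothetical helper and not part of HTSeq:

```python
import io


def peek_sam(handle, max_records=1):
    """Count leading header lines and return the first alignment records.

    Hypothetical helper for illustration; expects a text SAM handle,
    not a binary BAM file.
    """
    header_lines = 0
    records = []
    for line in handle:
        # SAM header lines start with '@'; read names cannot.
        if line.startswith("@"):
            header_lines += 1
            continue
        if line.strip():
            records.append(line.rstrip("\n"))
            if len(records) >= max_records:
                break
    return header_lines, records


# A SAM file that has a header but no alignments.
header_only = io.StringIO("@HD\tVN:1.0\tSO:coordinate\n@SQ\tSN:chr1\tLN:1000\n")
n_header, first_records = peek_sam(header_only)
print(n_header, first_records)
```

On a real file this would be called as peek_sam(open("large.sam")), with large.sam standing in for the actual path. An empty record list would be consistent with the error above, while a non-empty one points to a different cause.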