Question

htseq-count returns 0 count for all

4.1 years ago

ziziqolo ▴ 10

Hi Biostars,

I'm struggling with a problem. I want to generate read counts with htseq-count. My setup:

10 BAM files, aligned with HISAT2 (alignment rate 60%-80%),

sorted with samtools,

both the .fna file and the GFF downloaded from NCBI (one species).

Start of the BAM header:

@HD     VN:1.0  SO:coordinate
@SQ     SN:chrNC_037328.1       LN:158534110
@SQ     SN:chrNC_037329.1       LN:136231102
@SQ     SN:chrNC_037330.1       LN:121005158
@SQ     SN:chrNC_037331.1       LN:120000601
@SQ     SN:chrNC_037332.1       LN:120089316
@SQ     SN:chrNC_037333.1       LN:117806340
@SQ     SN:chrNC_037334.1       LN:110682743
@SQ     SN:chrNC_037335.1       LN:113319770
@SQ     SN:chrNC_037336.1       LN:105454467

The first row of the GFF file is:

NC_037328.1 RefSeq  region  1   158534110   .   +   .   ID=NC_037328.1:1..158534110;Dbxref=taxon:9913;Name=1;breed=Hereford;chromosome=1;gbkey=Src;genome=chromosome;isolate=L1 Dominette 01449 registration number 42190680;mol_type=genomic DNA;sex=female;tissue-type=left lung

The commands I used for htseq-count:

htseq-count -t gene -i gene file.sorted.sam annotation.gff > file.txt

htseq-count -s yes -t gene -i ID file.sorted.sam annotation.gff > file.txt

I also tried --idattr=gene, --idattr=exon, -i Parent and -i transcript_id, but all of these either return 0 counts or an error.

Would you please help me?

Any answer is appreciated.

Many thanks

RNA-Seq htseq-count • 1.5k views

ADD COMMENT • link updated 14 months ago by lieven.sterck 15k • written 4.1 years ago by ziziqolo ▴ 10

score 3 · Accepted Answer · 2021-02-24

4.1 years ago

lieven.sterck 15k

You have to make sure the sequence names in your GFF are exactly the same as in the FASTA file you aligned against.

In your specific case: NC_037328.1 != chrNC_037328.1
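
A quick way to check for this kind of mismatch is to compare the names in the BAM header with the first column of the GFF. The following is only a minimal sketch in Python using the standard library; the function names are made up for illustration, and the header text is assumed to come from something like `samtools view -H file.sorted.bam`. The example data is taken from the question.

```python
def sam_header_names(header_text):
    """Collect sequence names from the SN: field of @SQ header lines."""
    names = set()
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            for field in line.split()[1:]:
                if field.startswith("SN:"):
                    names.add(field[3:])
    return names


def gff_seqids(gff_text):
    """Collect the first column (seqid) of every non-comment GFF line."""
    names = set()
    for line in gff_text.splitlines():
        if line.strip() and not line.startswith("#"):
            names.add(line.split(None, 1)[0])
    return names


# Example data copied from the question (header shortened to two sequences).
header = (
    "@HD\tVN:1.0\tSO:coordinate\n"
    "@SQ\tSN:chrNC_037328.1\tLN:158534110\n"
    "@SQ\tSN:chrNC_037329.1\tLN:136231102\n"
)
gff = "NC_037328.1\tRefSeq\tregion\t1\t158534110\t.\t+\t.\tID=NC_037328.1:1..158534110\n"

shared = sam_header_names(header) & gff_seqids(gff)
print(sorted(shared))  # prints [] : no name in common, so no read can overlap a feature
```

If the set of shared names is empty, a count of 0 for every feature is the expected outcome.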

ADD COMMENT • link 4.1 years ago by lieven.sterck 15k

Thank you, dear lieven.sterck. You solved my problem.

ADD REPLY • link 4.1 years ago by ziziqolo ▴ 10

Hey, I am encountering a similar issue to this. How did you change your code?

ADD REPLY • link 16 months ago by Bjorn • 0

You don't need to change the code. You need to change the input data files. As stated above, you need to make sure the sequence names are exactly the same in both files.
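
One possible way to make the names agree is to rewrite the GFF so that its sequence names carry the same `chr` prefix as the BAM header. This is a rough sketch, not the only fix (re-aligning against a FASTA with the original names would work too). The function name `prefix_gff_seqids` is hypothetical, and the sketch assumes the GFF has no embedded `##FASTA` section.

```python
def prefix_gff_seqids(gff_text, prefix="chr"):
    """Return GFF text with `prefix` added to every sequence name.

    Assumption: the file has no embedded ##FASTA section.
    """
    out = []
    for line in gff_text.splitlines():
        if line.startswith("##sequence-region"):
            # Directive layout: ##sequence-region <seqid> <start> <end>
            parts = line.split()
            if len(parts) > 1:
                parts[1] = prefix + parts[1]
            out.append(" ".join(parts))
        elif not line or line.startswith("#"):
            # Leave blank lines, comments and other directives untouched.
            out.append(line)
        else:
            # The seqid is the first column, so prefixing the line is enough.
            out.append(prefix + line)
    return "\n".join(out) + "\n"


example = (
    "##gff-version 3\n"
    "##sequence-region NC_037328.1 1 158534110\n"
    "NC_037328.1\tRefSeq\tgene\t1\t100\t.\t+\t.\tID=gene0\n"
)
fixed = prefix_gff_seqids(example)
print(fixed, end="")
```

To apply it to a real file, read the annotation, pass the text through the function, and write the result to a new file, then run htseq-count against that new GFF.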

ADD REPLY • link 14 months ago by lieven.sterck 15k