Hi Biostars,
I'm struggling with a problem. I want to generate read counts with htseq-count. My setup:
10 BAM files, aligned with HISAT2 (alignment rate 60% - 80%),
sorted with samtools,
both the FASTA (.fna) file and the GFF file downloaded from NCBI (one species).
Head of a BAM file:
@HD VN:1.0 SO:coordinate
@SQ SN:chrNC_037328.1 LN:158534110
@SQ SN:chrNC_037329.1 LN:136231102
@SQ SN:chrNC_037330.1 LN:121005158
@SQ SN:chrNC_037331.1 LN:120000601
@SQ SN:chrNC_037332.1 LN:120089316
@SQ SN:chrNC_037333.1 LN:117806340
@SQ SN:chrNC_037334.1 LN:110682743
@SQ SN:chrNC_037335.1 LN:113319770
@SQ SN:chrNC_037336.1 LN:105454467
The first row of the GFF file is:
NC_037328.1 RefSeq region 1 158534110 . + . ID=NC_037328.1:1..158534110;Dbxref=taxon:9913;Name=1;breed=Hereford;chromosome=1;gbkey=Src;genome=chromosome;isolate=L1 Dominette 01449 registration number 42190680;mol_type=genomic DNA;sex=female;tissue-type=left lung
The commands I used for htseq-count:
htseq-count -t gene -i gene file.sorted.sam annotation.gff > file.txt
htseq-count -s yes -t gene -i ID file.sorted.sam annotation.gff > file.txt
I also tried --idattr=gene, --idattr=exon, -i Parent and -i transcript_id,
but all of these return either 0 counts or an error.
Would you please help me?
Any answer is appreciated.
Many thanks.
Thank you, dear lieven.sterck. You solved my problem.
Hey, I am encountering a similar issue to this. How did you change your code?
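You don't need to change the code; you need to change the input data files. As stated above, make sure the sequence names are exactly the same in both files. In the question, the BAM header uses chrNC_037328.1 while the GFF uses NC_037328.1, so no read can be assigned to any feature.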
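For anyone landing here with the same symptom, a rough sketch of how the mismatch could be checked and fixed on the annotation side. The function names and the files header.txt and annotation.chr.gff are made up for illustration, and it assumes the only difference is the "chr" prefix seen in the BAM header above. Re-aligning against the unmodified FASTA would be the other option.

```shell
#!/bin/sh
# Sketch only: helper names and file names are hypothetical.

# List sequence names from a SAM header dump, e.g. one made with:
#   samtools view -H file.sorted.bam > header.txt
sam_seqnames() {
  awk -F '\t' '$1 == "@SQ" {
    for (i = 2; i <= NF; i++)
      if ($i ~ /^SN:/) print substr($i, 4)
  }' "$1" | sort -u
}

# List sequence names (column 1) used by the GFF feature lines.
gff_seqnames() {
  awk -F '\t' '!/^#/ && NF >= 9 { print $1 }' "$1" | sort -u
}

# Print the GFF with "chr" prepended to column 1 of every feature line.
# Assumption: the BAM names are exactly "chr" + the GFF names.
# Comment/directive lines and malformed lines are passed through unchanged.
add_chr_prefix() {
  awk 'BEGIN { FS = OFS = "\t" }
       /^#/ || NF < 9 { print; next }
       { $1 = "chr" $1; print }' "$1"
}

# Possible usage (not run here):
#   sam_seqnames header.txt | head -n 3
#   gff_seqnames annotation.gff | head -n 3
#   add_chr_prefix annotation.gff > annotation.chr.gff
#   htseq-count -t gene -i gene file.sorted.sam annotation.chr.gff > file.txt
```

If the two name lists already look identical, the zero counts probably have another cause (for example the -s strandedness setting or the -t / -i choice), and this fix would not help.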