Read count using htseq-count.
3.8 years ago
Saeed ▴ 10

Hi

I was trying to load a .bam file (STAR output) to count reads with htseq-count.

I get this error.

Error occured when processing GFF file (line 3012 of file /ref_genome/stre.gff):
  Feature rna-SCOt01 does not contain a 'gene_name' attribute
  [Exception type: ValueError, raised in count.py:76]

This is my GFF file:

##gff-version 3
#!gff-spec-version 1.21
#!processor NCBI annotwriter
#!genome-build ASM20383v1
#!genome-build-accession NCBI_Assembly:GCF_000203835.1
##sequence-region NC_003888.3 1 8667507
##species https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=100226
NC_003888.3     RefSeq  region  1       8667507 .       +       .       ID=NC_003888.3:1..8667507;Dbxref=taxon:100226;Name=ANONYMOUS;gbkey=Src;genome=chromosome;mol_type=genomic DNA;old-name=Streptomyces coelicolor;serovar=A3(2)
NC_003888.3     RefSeq  sequence_feature        1       21653   .       +       .       ID=id-NC_003888.3:1..21653;Note=TIR-L. Left hand chromosome end terminal inveretd repeat.;gbkey=misc_feature
NC_003888.3     RefSeq  region  435     440     .       +       .       ID=id-NC_003888.3:435..440;gbkey=RBS
NC_003888.3     RefSeq  gene    446     1123    .       +       .       ID=gene-SCO0001;Dbxref=GeneID:1095448;Name=SCO0001;gbkey=Gene;gene_biotype=protein_coding;gene_synonym=SCEND.02c;locus_tag=SCO0001
NC_003888.3     RefSeq  CDS     446     1123    .       +       0       ID=cds-NP_624362.1;Parent=gene-SCO0001;Dbxref=Genbank:NP_624362.1,GeneID:1095448;Name=NP_624362.1;Note=SCEND.02c%2C unknown%2C doubtful CDS%2C len: 225aa;gbkey=CDS;locus_tag=SCO0001;product=hypothetical protein;protein_id=NP_624362.1;transl_table=11
NC_003888.3     RefSeq  region  1238    1243    .       +       .       ID=id-NC_003888.3:1238..1243;gbkey=RBS
NC_003888.3     RefSeq  gene    1252    3813    .       +       .       ID=gene-SCO0002;Dbxref=GeneID:1095447;Name=SCO0002;gbkey=Gene;gene_biotype=protein_coding;gene_synonym=SC8E7.42c,SCEND.01c,SCJ24.01
Assembly RNA-Seq gene • 1.2k views

It looks like your GFF file has no "gene_name" attribute. If you look closer, the name is actually stored as "Name". You can change the attribute htseq-count reads with the -i flag. See the manual.
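To see which attribute names are actually available for each feature type before choosing a value for -i, one option is to tabulate the keys in column 9 of the GFF. The following is a minimal sketch using only the Python standard library, not part of HTSeq; the function name `attribute_keys_by_type` is made up for illustration, and the sample lines are shortened versions of the excerpt in the question, assuming the real file is tab-separated as GFF3 requires.

```python
from collections import Counter, defaultdict


def attribute_keys_by_type(gff_lines):
    """Map each feature type (column 3) to a Counter of attribute keys (column 9)."""
    keys_by_type = defaultdict(Counter)
    for line in gff_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a well-formed GFF3 feature line
        feature_type, attributes = cols[2], cols[8]
        for field in attributes.split(";"):
            key, sep, _value = field.partition("=")
            if sep:
                keys_by_type[feature_type][key.strip()] += 1
    return keys_by_type


# Shortened lines from the excerpt in the question (assumed tab-separated).
sample = [
    "NC_003888.3\tRefSeq\tgene\t446\t1123\t.\t+\t.\t"
    "ID=gene-SCO0001;Name=SCO0001;gbkey=Gene;locus_tag=SCO0001",
    "NC_003888.3\tRefSeq\tCDS\t446\t1123\t.\t+\t0\t"
    "ID=cds-NP_624362.1;Parent=gene-SCO0001;Name=NP_624362.1;locus_tag=SCO0001",
]

summary = attribute_keys_by_type(sample)
for feature_type, keys in sorted(summary.items()):
    print(feature_type, sorted(keys))
```

On a real annotation the same function can be given an open file handle instead of the list. An attribute is a reasonable candidate for -i only if it appears on every feature of the type being counted.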


I am sorry, I tried with "Name" but it is still not working. "locus_tag" works but gives zero counts.

tput_basename.counts
SCOr01 0
SCOr02 0
SCOr03 0
SCOr04 0
SCOr05 0
SCOr06 0
SCOr07 0
SCOr08 0
SCOr09 0
SCOr10 0

Any help is appreciated.


Try with ID then; the problem is that the attribute is not being recognized.
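The error in the question is raised when a feature of the counted type lacks the requested attribute, so it can help to list those features up front. Below is a rough sketch of such a check in plain Python, independent of HTSeq; the helper name `features_missing_attribute` is hypothetical, and the sample lines are shortened from the excerpt in the question, assuming tab-separated columns.

```python
def features_missing_attribute(gff_lines, feature_type, attribute):
    """Return (line_number, ID) for features of `feature_type` lacking `attribute`."""
    missing = []
    for line_number, line in enumerate(gff_lines, start=1):
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        cols = line.split("\t")
        if len(cols) != 9 or cols[2] != feature_type:
            continue  # only inspect well-formed lines of the requested type
        # Column 9 holds semicolon-separated key=value pairs.
        attrs = dict(
            field.split("=", 1) for field in cols[8].split(";") if "=" in field
        )
        if attribute not in attrs:
            missing.append((line_number, attrs.get("ID")))
    return missing


# Shortened lines from the excerpt in the question (assumed tab-separated).
sample_lines = [
    "NC_003888.3\tRefSeq\tgene\t446\t1123\t.\t+\t.\t"
    "ID=gene-SCO0001;Name=SCO0001;gbkey=Gene;locus_tag=SCO0001",
    "NC_003888.3\tRefSeq\tCDS\t446\t1123\t.\t+\t0\t"
    "ID=cds-NP_624362.1;Parent=gene-SCO0001;Name=NP_624362.1;locus_tag=SCO0001",
]

missing_gene_name = features_missing_attribute(sample_lines, "gene", "gene_name")
missing_locus_tag = features_missing_attribute(sample_lines, "gene", "locus_tag")
print(missing_gene_name)
print(missing_locus_tag)
```

An empty result for a given feature type and attribute means that combination should at least get past the attribute check; it says nothing about whether reads will overlap those features.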


I tried it, it doesn't work!


Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized. SUBMIT ANSWER is for new answers to original question.
