I'm trying to count reads with htseq-count:
htseq-count --mode=union --stranded=no --idattr=gene_id -f sam ./filteredfirstZBND1X_gff.sam ./filteredsecZBND1X_gff.sam /home/madzays/data/finch_data/ZF_transcriptome/taeGut2_gff/taeGut2.gff > ZBND1X_gff_counts.sam
but I get this error:
Error occured when processing GFF file (line 11 of file /home/madzays/data/finch_data/ZF_transcriptome/taeGut2_gff/taeGut2.gff):
Feature id1 does not contain a 'gene_id' attribute [Exception type:
ValueError, raised in count.py:77]
This is what my GFF file (from NCBI) looks like:
##gff-version 3
#!gff-spec-version 1.21
#!processor NCBI annotwriter
#!genome-build Taeniopygia_guttata-3.2.4
#!genome-build-accession NCBI_Assembly:GCF_000151805.1
##sequence-region NC_011462.1 1 118548696
##species http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=59729
NC_011462.1 RefSeq region 1 118548696 . + . ID=id0;Dbxref=taxon:59729;Name=1;chromosome=1;gbkey=Src;genome=chromosome;isolate=Black17;mol_type=genomic DNA;sex=male
NC_011462.1 Gnomon gene 17853 28371 . + . ID=gene0;Dbxref=GeneID:100227449;Name=CLIC6;gbkey=Gene;gene=CLIC6;gene_biotype=protein_coding;partial=true;start_range=.,17853
NC_011462.1 Gnomon mRNA 17853 28371 . + . ID=rna0;Parent=gene0;Dbxref=GeneID:100227449,Genbank:XM_002186596.2;Name=XM_002186596.2;gbkey=mRNA;gene=CLIC6;partial=true;product=chloride intracellular channel 6;start_range=.,17853;transcript_id=XM_002186596.2
- Try converting the GFF to GTF and feeding the GTF to htseq-count. The gffread utility can do the conversion.
- Switch to featureCounts.
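Before converting anything, it may help to see why htseq-count complains: the error says the feature has no 'gene_id' attribute, and the NCBI lines shown in the question use keys such as ID, Parent, Name and gene instead. Below is a minimal Python sketch that lists the attribute keys on one line. The function name parse_attributes is made up for illustration, and the sample line is the "gene" record from the question rebuilt with tab separators (an assumption, since the post shows it with spaces).

```python
# Sketch: list the attribute keys on a GFF3 line to see what --idattr can match.
# The sample line is the "gene" record from the question, rebuilt with tabs.

def parse_attributes(gff_line):
    """Return column 9 of a GFF3 feature line as a {key: value} dict."""
    fields = gff_line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError("expected 9 tab-separated columns, got %d" % len(fields))
    attrs = {}
    for pair in fields[8].split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs

sample = "\t".join([
    "NC_011462.1", "Gnomon", "gene", "17853", "28371", ".", "+", ".",
    "ID=gene0;Dbxref=GeneID:100227449;Name=CLIC6;gbkey=Gene;gene=CLIC6;"
    "gene_biotype=protein_coding;partial=true;start_range=.,17853",
])

attrs = parse_attributes(sample)
print(sorted(attrs))        # every attribute key present on this line
print("gene_id" in attrs)   # False
print(attrs.get("gene"))    # CLIC6
```

htseq-count counts exon features by default, and the exon lines are not shown in the question, so it is worth running the same check on one of them. If they carry gene= like the lines above, passing that name to --idattr instead of gene_id may be enough, but treat that as something to verify against your own file.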