I am trying to use HTSeq to generate count data from a Pseudomonas aeruginosa RNA-seq sample. All of my reads end up counted as __no_feature (or as not uniquely aligned), and none are assigned to a gene:
__no_feature 6171314
__ambiguous 0
__too_low_aQual 0
__not_aligned 0
__alignment_not_unique 478089
This is what my GTF file looks like:
##gff-version 2
##source-version rtracklayer 1.38.3
##date 2018-08-06
NC_002516 PseudoCAP region 1 6264404 . . . ID "NC_002516"; Name "Pseudomonas aeruginosa PAO1 NC_002516,complete genome."; Dbxref "refseq:NC_002516";
NC_002516 PseudoCAP gene 483 2027 . + 0 ID "gene134012"; Dbxref "GeneID:878417"; Alias "PA0001"; name "dnaA";
NC_002516 PseudoCAP CDS 483 2027 . + 0 ID "CDS134013"; name "chromosomal replication initiator protein DnaA"; Parent "gene134012"; locus "PA0001"
NC_002516 PseudoCAP CDS 2056 3159 . + 0 ID "CDS134019"; name "DNA polymerase III,beta chain"; Parent "gene134018"; locus "PA0002"
NC_002516 PseudoCAP gene 2056 3159 . + 0 ID "gene134018"; Dbxref "GeneID:879244"; Alias "PA0002"; name "dnaN";
NC_002516 PseudoCAP CDS 3169 4278 . + 0 ID "CDS134021"; name "RecF protein"; Parent "gene134020"; locus "PA0003"
NC_002516 PseudoCAP gene 3169 4278 . + 0 ID "gene134020"; Dbxref "GeneID:879229"; Alias "PA0003"; name "recF";
This GTF file was generated with rtracklayer from the following GFF file:
##gff-version 3
##sequence-region chromosome 1 6264404
chromosome PseudoCAP region 1 6264404 . . . ID=chromosome;Name=Pseudomonas aeruginosa PAO1 chromosome, complete genome.;Dbxref=refseq:NC_002516
chromosome PseudoCAP gene 483 2027 . + 0 ID=gene134012;Alias=PA0001;name=dnaA;Dbxref=GeneID:878417
chromosome PseudoCAP CDS 483 2027 . + 0 ID=CDS134013;Parent=gene134012;locus=PA0001;name=chromosomal replication initiator protein DnaA;
chromosome PseudoCAP CDS 2056 3159 . + 0 ID=CDS134019;Parent=gene134018;locus=PA0002;name=DNA polymerase III, beta chain;
chromosome PseudoCAP gene 2056 3159 . + 0 ID=gene134018;Alias=PA0002;name=dnaN;Dbxref=GeneID:879244
chromosome PseudoCAP CDS 3169 4278 . + 0 ID=CDS134021;Parent=gene134020;locus=PA0003;name=RecF protein;
chromosome PseudoCAP gene 3169 4278 . + 0 ID=gene134020;Alias=PA0003;name=recF;Dbxref=GeneID:879229
The GFF file and the accompanying FASTA file (Pseudomonas_aeruginosa_PAO1_107, both from NCBI) were used with STAR 2.5.3 to generate the BAM file. I changed the sequence name from "chromosome" to "NC_002516" so that it would match the FASTA file for STAR indexing. I used Picard to deduplicate the BAM file.
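One thing that can be checked after a rename like this is whether the sequence names in the annotation match the reference names in the BAM header. The following is a minimal Python sketch under stated assumptions: the header text is what `samtools view -H` would print, the `@SQ` line shown is an illustrative stand-in for the real header, the annotation files are tab-separated, and the helper names are made up for this example.

```python
def sam_header_refs(header_text):
    """Collect reference names from the @SQ lines of a SAM/BAM header."""
    refs = set()
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        for tag in line.split("\t")[1:]:
            if tag.startswith("SN:"):
                refs.add(tag[3:])
    return refs


def annotation_seqnames(annotation_text):
    """Collect the first column (seqname) of every non-comment GFF/GTF line."""
    names = set()
    for line in annotation_text.splitlines():
        if line and not line.startswith("#"):
            names.add(line.split("\t", 1)[0])
    return names


# Illustrative header (assumption): one reference named like the FASTA record.
HEADER = "@HD\tVN:1.4\tSO:coordinate\n@SQ\tSN:NC_002516\tLN:6264404\n"

# One record each from the original GFF and the renamed GTF.
ORIGINAL_GFF = (
    "##gff-version 3\n"
    "chromosome\tPseudoCAP\tgene\t483\t2027\t.\t+\t0\tID=gene134012\n"
)
RENAMED_GTF = (
    "##gff-version 2\n"
    'NC_002516\tPseudoCAP\tgene\t483\t2027\t.\t+\t0\tID "gene134012";\n'
)

bam_refs = sam_header_refs(HEADER)
missing_original = annotation_seqnames(ORIGINAL_GFF) - bam_refs
missing_renamed = annotation_seqnames(RENAMED_GTF) - bam_refs

print("seqnames missing from BAM (original GFF):", missing_original)
print("seqnames missing from BAM (renamed GTF):", missing_renamed)
```

An empty set for the renamed file would suggest that the sequence names are not the cause of the problem.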
The GFF/GTF lists every entry with a feature type of gene, CDS, or rna.
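This matters because htseq-count, by default, counts only features of type `exon` (`--type`) and groups them by the `gene_id` attribute (`--idattr`). A quick way to see what a file actually offers is to tally its feature types and attribute keys. The sketch below is a minimal illustration, assuming a tab-separated file and attribute values that contain no semicolons; the function name `summarize_gtf` is made up for this example, and the sample holds two records from the GTF above.

```python
from collections import Counter

# Two records from the GTF above, written with tabs between the nine columns
# (assumption: the real file is tab-separated).
SAMPLE_GTF = (
    "##gff-version 2\n"
    'NC_002516\tPseudoCAP\tgene\t483\t2027\t.\t+\t0\t'
    'ID "gene134012"; Dbxref "GeneID:878417"; Alias "PA0001"; name "dnaA";\n'
    'NC_002516\tPseudoCAP\tCDS\t483\t2027\t.\t+\t0\t'
    'ID "CDS134013"; name "chromosomal replication initiator protein DnaA"; '
    'Parent "gene134012"; locus "PA0001"\n'
)


def summarize_gtf(lines):
    """Count feature types (column 3) and attribute keys (column 9)."""
    feature_types = Counter()
    attribute_keys = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a well-formed nine-column record
        feature_types[cols[2]] += 1
        # Simplification: assumes no ';' inside quoted attribute values.
        for field in cols[8].split(";"):
            field = field.strip()
            if field:
                attribute_keys[field.split(" ", 1)[0]] += 1
    return feature_types, attribute_keys


types, keys = summarize_gtf(SAMPLE_GTF.splitlines())
print("feature types:", dict(types))
print("attribute keys:", dict(keys))
print("has exon features:", "exon" in types)
print("has gene_id attribute:", "gene_id" in keys)
```

If neither `exon` nor `gene_id` appears, the `--type` and `--idattr` options would need to name a feature type and an attribute that do exist in the file.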
Does anyone have any suggestions about what I might be doing wrong, or whether some error in the formatting of my GTF file may be causing this problem?
Thanks in advance for any advice.
Still working on the solution, thank you both very much.