Newbie here. I am trying to use featureCounts to assign reads to features (exons). My reference is E. coli, and I downloaded the GTF file from here: https://www.ncbi.nlm.nih.gov/genome/167?genome_assembly_id=161521 My percentage of assigned reads is very low after running featureCounts, and I also noticed that the GTF file lists only ~300 exons. Is this correct for E. coli? I can't find any resources online to help me with this. Here is the featureCounts command I ran:
featureCounts -p -t exon -g gene -T 16 -s 2 -a GCF_000005845.2_ASM584v2_genomic.gtf -o counts.txt input_file.bam
Maybe I am not downloading the correct GTF file, but I need this exact strain.
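One way to check what the annotation actually contains is to tally the feature types in column 3 of the GTF. This is only a minimal sketch: the helper name `count_feature_types` is made up for illustration, and it assumes a standard tab-separated GTF with `#` comment lines.

```shell
# Count how many lines of each feature type (column 3) a GTF file contains.
# count_feature_types is a hypothetical helper name, not part of any tool.
count_feature_types() {
    # Skip comment lines, keep the feature-type column, then tally each type.
    grep -v '^#' "$1" | cut -f3 | sort | uniq -c
}

gtf=GCF_000005845.2_ASM584v2_genomic.gtf
# Only run on the real annotation if it is present in the current directory.
if [ -f "$gtf" ]; then
    count_feature_types "$gtf"
fi
```

If `exon` turns out to be rare while `gene` and `CDS` are numerous, a low assignment rate with `-t exon` would be expected.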
Wow, this helps a lot, thanks. One more question: I am doing a differential expression analysis, and I have read that you should not specify "CDS" when doing DE. Is this true?
Technically true, but for prokaryotes in general, and in this particular case where exons are not even annotated and where gene and CDS share the same coordinates, it makes no difference.
You could also use gene instead of CDS or exon.
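As a rough sketch of that suggestion, the command from the question can be rerun with only the `-t` value changed. Everything else is copied from the original command; the `feature_type` variable, the output file name, and the guard around the call are assumptions added for illustration.

```shell
# Feature type to count on; "gene" should work equally well here.
feature_type=CDS

# Guard (illustrative): only run if featureCounts and the BAM file are available.
if command -v featureCounts >/dev/null 2>&1 && [ -f input_file.bam ]; then
    # Same options as the original command, with -t switched away from "exon".
    featureCounts -p -t "$feature_type" -g gene -T 16 -s 2 \
        -a GCF_000005845.2_ASM584v2_genomic.gtf \
        -o "counts_${feature_type}.txt" input_file.bam
fi
```

Comparing the assignment percentage in the resulting summary file against the `-t exon` run should show whether the feature type was the cause.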
Okay great, thanks very much for your help!