What feature type should I select for the featureCounts tool? I am generating a count matrix from BAM files produced by an RNA-Seq pipeline. The count matrix will be fed to DESeq2. I ran featureCounts with the option -t gene,mRNA,CDS,exon, but when I searched I found that it is enough to use 'gene,mRNA'. What features should I use?
Since you say this is RNA-seq data to be fed to DESeq2, most likely you want -t exon. This means you count only the reads mapping to exons. In "-t exon,CDS", CDS may be redundant, since every CDS should also be listed as an exon in most GFF/GTF files out there, but I don't think it does any harm to include it. On the other hand, -t exon,mRNA,gene will also count reads mapped to introns, which typically is not what you want.
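To make the suggestion concrete, here is a minimal Python sketch that assembles such a featureCounts command line. The file names (annotation.gtf, sample1.bam, sample2.bam, counts.txt) and the helper name build_featurecounts_cmd are placeholders I made up for illustration; only the -t, -g, -a and -o options are taken from featureCounts itself. The command is only printed, not executed.

```python
import shlex


def build_featurecounts_cmd(annotation, bams, out="counts.txt",
                            feature_type="exon", group_attr="gene_id"):
    """Return a featureCounts argument list (hypothetical helper).

    -t selects which feature type (GTF column 3) reads are assigned to,
    -g selects the attribute used to group features into meta-features.
    """
    return ["featureCounts",
            "-t", feature_type,
            "-g", group_attr,
            "-a", annotation,
            "-o", out,
            *bams]


# Placeholder file names; replace them with your own paths.
cmd = build_featurecounts_cmd("annotation.gtf", ["sample1.bam", "sample2.bam"])
print(shlex.join(cmd))
```

If the printed line looks right, the same list can be handed to subprocess.run(cmd, check=True) on a machine where featureCounts is installed.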
You also need to tell featureCounts how to group exons within genes (or in more general terms: how to group features within meta-features). This is the purpose of the -g option which defaults to gene_id, so make sure every exon has a gene_id attribute.
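As a rough way to check that every exon has a gene_id before running featureCounts, something like the Python sketch below could be used. It assumes a tab-separated GFF/GTF file with nine columns and attributes written either as key "value"; (GTF) or key=value; (GFF3). The function name and the three example lines are made up for illustration.

```python
import re
from collections import Counter


def missing_attribute_by_type(lines, attr="gene_id"):
    """For each feature type (column 3), return (rows lacking attr, total rows)."""
    # Match the attribute name at the start of column 9 or right after a ';',
    # followed by a space (GTF style) or '=' (GFF3 style).
    pattern = re.compile(r"(?:^|;)\s*" + re.escape(attr) + r"[ =]")
    total, missing = Counter(), Counter()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete annotation row
        feature_type = fields[2]
        total[feature_type] += 1
        if not pattern.search(fields[8]):
            missing[feature_type] += 1
    return {t: (missing[t], total[t]) for t in total}


# Made-up example rows; the second exon has no gene_id.
example = [
    'chr1\tsrc\tgene\t1\t1000\t.\t+\t.\tgene_id "g1";',
    'chr1\tsrc\texon\t1\t200\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
    'chr1\tsrc\texon\t300\t500\t.\t+\t.\ttranscript_id "t1";',
]
report = missing_attribute_by_type(example)
print(report)
```

For a real file, pass an open file handle instead of the example list, e.g. missing_attribute_by_type(open("annotation.gtf")). A non-zero first number for the exon entry means some exons cannot be grouped with -g gene_id.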
If you are confused (it is confusing), post a few lines of your GFF/GTF file, and perhaps some explanation of what you are trying to do if your analysis is not a typical differential gene expression analysis with DESeq2.
Thank you so much. That information was really useful.
May I ask a couple of follow-up questions?
When I was processing the GFF files, there were some issues with the 9th column. Some of the rows didn't have the 'gene_id' attribute, so I deleted those rows and computed the count matrix. Is it OK to do that?
I had another GFF file which was not accepted by the featureCounts tool. Again, the issue was the 9th column. My question is: is it OK to remove all the details except 'gene_id' from the 9th column?