I'm analysing RNA-Seq data ( I should add that my bioinformatic knowledge is quite limited) and after aligning using TopHat I want to count reads in order to perform differential analysis of expression.
To achieve this I used both HTSeq-count and features counts and I have different GFF/GTF files that I use as annotation for the features.
I'm going to show you the example of the annotation file for the same gene
One of the files look like this (lets called it gene.gtf
):
scaffold_5 XENBASE_XT_7_1_GENE_MODEL_2 gene 72053579 72062232 . + . Name=t;ID=4652532;GeneName=T,%20brachyury%20homolog;GeneFunction=T-box%20transcription%20factor;GeneSynonyms=ntl,brachyury,Xbrachyury,bra,X-bra,Xbra
And the other like this (exon.gtf
):
scaffold_5 xenTro7_1_genome_v7_2 start_codon 72055079 72055081 0.000000 + . gene_id "Xetrov72001125|Xetrov72001125|t"; transcript_id "Xetrov72001125|Xetrov72001125|t";
scaffold_5 xenTro7_1_genome_v7_2 CDS 72055079 72055284 0.000000 + 0 gene_id "Xetrov72001125|Xetrov72001125|t"; transcript_id "Xetrov72001125|Xetrov72001125|t";
scaffold_5 xenTro7_1_genome_v7_2 exon 72053579 72055284 0.000000 + . gene_id "Xetrov72001125|Xetrov72001125|t"; transcript_id "Xetrov72001125|Xetrov72001125|t";
scaffold_5 xenTro7_1_genome_v7_2 CDS 72057175 72057439 0.000000 + 1 gene_id "Xetrov72001125|Xetrov72001125|t"; transcript_id "Xetrov72001125|Xetrov72001125|t";
scaffold_5 xenTro7_1_genome_v7_2 exon 72057175 72057439 0.000000 + . gene_id "Xetrov72001125|Xetrov72001125|t"; transcript_id "Xetrov72001125|Xetrov72001125|t";
scaffold_5 xenTro7_1_genome_v7_2 CDS 72058300 72058434 0.000000 + 0 gene_id "Xetrov72001125|Xetrov72001125|t"; transcript_id "Xetrov72001125|Xetrov72001125|t";
scaffold_5 xenTro7_1_genome_v7_2 exon 72058300 72058434 0.000000 + . gene_id "Xetrov72001125|Xetrov72001125|t"; transcript_id "Xetrov72001125|Xetrov72001125|t";
Basically the first file only has the whole gene feature while the other is divided by exons.
When I run HTSeq-count or featureCounts with the same bam file using either one or the other annotation file I get completely different results. With the gene.gtf
I get around 5000 reads for this gene while I get 0 reads for any feature using the exon.gtf
file.
Am I missing something really obvious?
Can anyone help me ? I want to count my miRNA by using featureCounts. I have a bam file after mapped with the hg38 by bowtie2. I tried a lot but I couldn't. Thanks in advance
You need to specify what exactly you tried to get meaningful help. Also don't post new questions as
ANSWERS
on existing threads.