2.2 years ago
Marco Pannone
Hey everybody
I am trying to find the explanation for a discrepancy I came across during some RNA-seq data analysis.
I noticed that switching between -t exon and -t gene in featureCounts strongly affects the output read counts for certain genes.
For example, some housekeeping genes gave the expected high read counts across all samples with -t exon, while with -t gene their read counts dropped to 0 or to values very close to 0.
I can't fully understand why, since I expect that -t gene should also include the exons in the annotation procedure.
I appreciate any comments in this regard.
Thanks!
I don't think any answer or comment here can substitute for you looking through the annotation file. Chances are that your assumption (genes include exons), while true in a biological sense, doesn't hold true in your file. I don't know the reason, but that almost has to be the explanation. I suggest you look through the annotation file and specifically search for the genes where you observed the discrepancy. I suspect their gene definitions/boundaries are defined incorrectly.
Thanks for your reply. I have downloaded and tried two different .gtf files, one from Ensembl and one from Gencode, and ended up with the same results mentioned in my post. I am definitely going to look through the annotation file, but it sounds a bit strange that both annotation files (from highly reliable sources) might have discrepancies.
When you have eliminated the impossible, whatever remains, however improbable, must be the truth.
You probably know the quote, or can find its origin easily. The problem must be with the operator, the program, or the files it uses. Since the operator hasn't changed between the two runs, and featureCounts is a mature and well-tested program, it would appear the culprit is in the annotation file. I suspect that some genes will have only exons defined, but not a complete gene boundary.
I really like the quote! I just googled it; I did not know it, but from now on I will definitely remember it.
You are right, I will look into the annotation files and find the reason for my concern. Thanks again for the time spent on my question, highly appreciated.
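One way to do the check suggested above is to list, for each gene of interest, which feature types (column 3 of the GTF) are present and which coordinates they cover. The following is only a minimal sketch, assuming a standard 9-column, tab-separated GTF with a gene_id attribute in the last column. The function name summarize_gtf, the gene IDs GENE_A and GENE_B, and the example records are hypothetical and made up for illustration; they are not taken from the Ensembl or Gencode files.

```python
import re

# Assumes attributes are written as: gene_id "SOME_ID";
GENE_ID_RE = re.compile(r'gene_id "([^"]+)"')


def summarize_gtf(lines, gene_ids):
    """Collect (start, end) intervals per feature type for the requested genes.

    `lines` can be any iterable of GTF lines, e.g. an open file handle.
    Returns {gene_id: {feature_type: [(start, end), ...]}}.
    """
    wanted = set(gene_ids)
    summary = {}
    for line in lines:
        if line.startswith("#"):
            continue  # header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a well-formed GTF record
        match = GENE_ID_RE.search(fields[8])
        if match is None or match.group(1) not in wanted:
            continue
        per_gene = summary.setdefault(match.group(1), {})
        per_gene.setdefault(fields[2], []).append((int(fields[3]), int(fields[4])))
    return summary


# Hypothetical records: GENE_A has a gene line and two exons,
# GENE_B has an exon line but no gene line.
EXAMPLE_GTF = [
    "#!genome-build example\n",
    'chr1\tdemo\tgene\t100\t900\t.\t+\t.\tgene_id "GENE_A";\n',
    'chr1\tdemo\texon\t100\t300\t.\t+\t.\tgene_id "GENE_A"; transcript_id "A.1";\n',
    'chr1\tdemo\texon\t700\t900\t.\t+\t.\tgene_id "GENE_A"; transcript_id "A.1";\n',
    'chr1\tdemo\texon\t2000\t2200\t.\t+\t.\tgene_id "GENE_B"; transcript_id "B.1";\n',
]

summary = summarize_gtf(EXAMPLE_GTF, ["GENE_A", "GENE_B"])

# Number of records per feature type for each gene.
feature_counts = {
    gene: {ftype: len(intervals) for ftype, intervals in features.items()}
    for gene, features in summary.items()
}

# Genes that have exon records but no gene record at all.
missing_gene_line = sorted(
    gene for gene, features in summary.items()
    if "exon" in features and "gene" not in features
)

for gene, counts in feature_counts.items():
    print(gene, counts)
print("exons but no gene line:", missing_gene_line)
```

To run this on a real annotation, pass an open file handle instead of EXAMPLE_GTF, together with the IDs of the genes whose counts changed. Note that gene_id values may be written differently in different annotation sources (for example with a version suffix), so copy the IDs exactly as they appear in the file being checked.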