Gene counts & expression level
5.1 years ago

Dear All,

Here's a naive question that plant biologists often ask me: can you tell me if gene X is expressed in my RNA-seq data set? I am quick to criticize this approach. A gene with a count of 10 is technically expressed, but that won't count (pun intended) unless the gene is differentially expressed as well.

In general, I discourage this practice for the reasons stated above, but I would still like to know how to answer the question. I have tried these approaches:

  • Generate TPMs by quantification methods such as Kallisto, Salmon etc.
  • VST/rlog normalized counts from DESeq2

I then plotted heatmaps for both.
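For reference, the TPM arithmetic behind the first approach can be sketched in a few lines. This is only a toy illustration: the gene names, counts, and lengths are made up, and it uses plain transcript lengths, whereas kallisto and Salmon report TPMs based on estimated effective lengths and handle multi-mapping reads.

```python
def counts_to_tpm(counts, lengths_bp):
    """Convert raw counts to TPM for a single sample.

    counts and lengths_bp are dicts keyed by gene name (hypothetical inputs).
    """
    # Reads per base: corrects for longer genes collecting more reads.
    rates = {gene: counts[gene] / lengths_bp[gene] for gene in counts}
    total = sum(rates.values())
    # Rescale so that the values in one sample sum to one million.
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}


# Made-up example data, not taken from any real experiment.
counts = {"geneX": 10, "geneY": 1000, "geneZ": 0}
lengths = {"geneX": 1000, "geneY": 2000, "geneZ": 1500}
tpm = counts_to_tpm(counts, lengths)

for gene, value in tpm.items():
    print(f"{gene}\t{value:.1f}")
```

Because TPM is a relative measure within a sample, a nonzero value only says that reads were assigned to the gene; it does not by itself settle whether the gene is "expressed" in any biological sense.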

Best, Sandeep

RNA-Seq gene-count • 1.1k views

Please see one of the many previous questions on this:

RNA-seq RPKM significance cut off

How Do You Justify Your Rna-Seq Expression Threshold (Fpkm/Rpkm) ?

identify high and low express genes from cufflinks output

What is the cutoff used for define high or low expression level of gene for survival analysis

Also, please go through the other questions you can find via a search engine of your choice. I am linking these not to discourage you, but there is little point in repeating what others have already written. If you have specific questions, feel free to post them.


Thanks for the links, I think I've already read a few of them before posting.

I wish I could be specific but the question posed on a weekly basis to me is as nebulous as it gets.

5.1 years ago
h.mon 35k

With most regular datasets, we really can't say whether a gene is expressed or not. To be confident of non-expression, I think one would have to use spike-ins and also sequence the library to a very high depth, making sure that sequencing has saturated the discovery of lowly expressed transcripts.

What can be done, though, is to filter out genes that are uninformative because of low expression. DESeq2 performs independent filtering automatically, and recent edgeR versions have the filterByExpr() function for a similar purpose, although you have to call it yourself because it is not applied by default. Note that this filtering is geared towards removing lowly expressed genes for which there is too little statistical power to draw inferences; it does not decide whether a gene is expressed. There is a good video on StatQuest showing how independent filtering works.
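To make the idea of a low-expression filter concrete, here is a minimal sketch of a CPM-based filter. It is loosely in the spirit of what filterByExpr() does, but it is not a reimplementation of edgeR or of DESeq2's independent filtering; the function name, the thresholds (min_count, min_samples), and the example counts are all assumptions chosen for illustration.

```python
from statistics import median


def filter_low_expression(counts, min_count=10, min_samples=2):
    """Keep genes whose CPM passes a cutoff in at least min_samples samples.

    counts: dict mapping gene name -> list of raw counts, one per sample.
    The thresholds are illustrative, not recommended defaults.
    """
    n_samples = len(next(iter(counts.values())))
    # Library size = total counts per sample.
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    # Express the raw-count threshold as CPM at the median library size.
    cpm_cutoff = min_count / median(lib_sizes) * 1e6
    kept = {}
    for gene, row in counts.items():
        cpm = [c / lib * 1e6 for c, lib in zip(row, lib_sizes)]
        if sum(value >= cpm_cutoff for value in cpm) >= min_samples:
            kept[gene] = row
    return kept


# Made-up toy counts for three genes across four samples.
example_counts = {
    "geneA": [500, 620, 480, 550],
    "geneB": [0, 1, 0, 2],
    "geneC": [12, 30, 9, 25],
}
kept_genes = filter_low_expression(example_counts)
print(sorted(kept_genes))
```

In this toy data geneB is dropped, yet it still has reads in two samples, which illustrates the point above: being filtered out means "too little information to test", not "not expressed".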
