6.6 years ago
firestar
I normally use limma for differential gene expression. This time, I am interested in genes that are stably expressed rather than differentially expressed. So, I am looking for the genes whose expression changes the least.
I am comparing two tissues, so the usual DGE assumption that 'most genes are not DE' is already violated.
How would I go about doing this? Any resources on this?
How many replicates do you have?
Six replicates per tissue.
I am just thinking of different ways of doing it.
You can calculate the variance of each gene (rowVars) across all samples (both tissues) and take the x genes that show the least variance. Plot the distribution of the variances and choose a cutoff.
Calculate the mean expression in one tissue and check whether the gene has a similar expression level in the other tissue (using +/- 1 standard deviation as the cutoff).
Take genes that do not show differential expression and whose log fold change is close to 0 (in line with what Devon suggested), but also consider the variance of expression across replicates (the lower the variance, the better). A gene might fail to be called differentially expressed simply because it has high variance within a tissue, which does not mean it shows consistent expression across tissues.
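The first two suggestions above can be combined into a small filter. This is only a sketch in plain Python, assuming log2-scale expression values that have already been normalised. The gene names, the values and the 0.05 variance cutoff are made up for illustration; in R the variance step would normally be done with rowVars on the expression matrix.

```python
from statistics import mean, stdev, variance

# Hypothetical log2 expression: gene -> (tissue A replicates, tissue B replicates)
expr = {
    "geneA": ([8.0, 8.1, 7.9, 8.0, 8.2, 7.8], [8.1, 8.0, 7.9, 8.1, 8.0, 7.9]),
    "geneB": ([5.0, 5.2, 4.9, 5.1, 5.0, 4.8], [9.0, 9.1, 8.9, 9.2, 9.0, 8.8]),
    "geneC": ([6.0, 7.5, 5.0, 8.0, 6.5, 4.5], [6.2, 7.0, 5.5, 7.8, 6.0, 5.0]),
}

def overall_variance(a, b):
    """Sample variance of one gene across all samples from both tissues."""
    return variance(a + b)

def within_one_sd(a, b):
    """True if the tissue B mean lies within mean(A) +/- 1 SD of tissue A."""
    return abs(mean(b) - mean(a)) <= stdev(a)

def stable_candidates(expr, max_variance):
    """Genes with low overall variance whose tissue means agree within 1 SD."""
    return [
        gene
        for gene, (a, b) in expr.items()
        if overall_variance(a, b) <= max_variance and within_one_sd(a, b)
    ]

# The cutoff is arbitrary here; in practice, choose it from the variance distribution.
print(stable_candidates(expr, max_variance=0.05))
```

In this toy data geneC has the same mean in both tissues, so it would not come out as differentially expressed, but its high variance keeps it out of the stable set. That is the distinction the third suggestion is making.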
geek_y's suggestion seems the way to go. In addition, you could also filter out weakly expressed probes and probes with a low signal-to-noise ratio, to select only "good" probes.
You are not looking for RT-qPCR reference genes, are you? There are plenty of papers on this subject, and in fact they might point to some interesting leads for your problem.
Pretty much just to agree with geek and h.mon here. I would have suggested looking at genes with low variance and also those with absolute Z-scores less than 1. The way that you normalise your read counts is crucial, though, as is implied by the other comment trail.
The assumption is less "most genes are not DE" than it is "there's no average global unidirectional change in signal". As long as the fold-changes are expected to average out to near 0, you won't have a problem with limma.
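A rough way to eyeball that assumption is to look at where the per-gene log fold changes are centred. The sketch below is plain Python on made-up log2 values. The names expr, log2_fold_changes and global_shift are hypothetical. It is an informal diagnostic, not a formal test, and not something limma does for you in this form.

```python
from statistics import mean, median

# Hypothetical log2 expression: gene -> (tissue A replicates, tissue B replicates)
expr = {
    "g1": ([5.0, 5.0], [7.0, 7.0]),  # up by 2
    "g2": ([9.0, 9.0], [7.0, 7.0]),  # down by 2
    "g3": ([6.0, 6.0], [6.0, 6.0]),  # unchanged
    "g4": ([4.0, 4.0], [5.0, 5.0]),  # up by 1
    "g5": ([8.0, 8.0], [7.0, 7.0]),  # down by 1
}

def log2_fold_changes(expr):
    """Per-gene difference of tissue means on the log2 scale (B minus A)."""
    return {gene: mean(b) - mean(a) for gene, (a, b) in expr.items()}

def global_shift(expr):
    """Mean and median log2 fold change across all genes."""
    lfc = list(log2_fold_changes(expr).values())
    return mean(lfc), median(lfc)

avg, med = global_shift(expr)
print(f"mean log2FC = {avg:.2f}, median log2FC = {med:.2f}")
```

Here many genes change, but the changes go in both directions and cancel out, which is the situation described above. A mean or median far from 0 would suggest a global one-directional shift.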
So, is it just a matter of running a standard DE analysis and then looking for genes with low FC and high p-values?
It's unclear if your real question is how to find stably expressed genes or if instead you're just trying to do that so you can normalize your samples appropriately.
It's not for normalisation. I actually want to find stably expressed genes. I want to find genes that do not change between tissues. One tissue is easy to get and the other is hard. One of the questions is whether the easy tissue can be used as a proxy for the hard-to-get tissue.
Intuitively, I would say no, one tissue can't be used as a proxy for another - they are different tissues after all. That said, some tissues could be similar enough to be considered as proxies for your undisclosed purposes.
If it's not for normalisation, I trust that you are nevertheless normalising your data in the normal way and adjusting for the fact that different tissues may be sequenced to different read depths. This can easily be accounted for by including tissue-type in, for example, the DESeq2 design formula. Limma can also manage it if you put it in the design matrix.
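In R, including the tissue in the design usually amounts to a formula such as ~ tissue, where tissue is an assumed name for the sample annotation column. To show what that encodes, here is a sketch in plain Python of the corresponding design matrix: an intercept plus a 0/1 indicator for the second tissue. The labels "easy" and "hard" and the helper design_matrix are hypothetical.

```python
# Hypothetical sample annotation: six replicates per tissue, as in the question
tissues = ["easy"] * 6 + ["hard"] * 6

def design_matrix(labels, reference):
    """Intercept column plus one 0/1 indicator column per non-reference level."""
    levels = [reference] + sorted(set(labels) - {reference})
    rows = [[1] + [int(lab == lev) for lev in levels[1:]] for lab in labels]
    return ["(Intercept)"] + levels[1:], rows

columns, rows = design_matrix(tissues, reference="easy")
print(columns)
for row in rows:
    print(row)
```

With this coding, the coefficient on the indicator column is the estimated difference between the two tissues, which is the quantity you would want to be close to 0 for a stably expressed gene.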