How to evaluate gene expression
4.1 years ago
Lorenzo ▴ 10

Hi Biostars community,

Let me explain my situation: I have to determine the expression of different genes in different tissues. Each tissue has a biological replicate coming from another individual (the only difference is 3 months in terms of age). These two individuals are healthy, so my expectation is that there is no significant difference in gene expression between them. So it is not useful in this case to perform DEA with DESeq and so on.

I calculated TPM values from raw reads with Salmon. I know that there are a lot of posts about this, but each post gives different information and none of them has a definitive answer.

My question is: in my situation, is it fine to use TPM, discarding all summarized gene counts other than those with raw counts > 10, and to consider genes with raw counts > 10 as expressed? Or is it better to move to another normalization method such as zFPKM to identify the threshold between active and background gene expression?
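To make the two quantities in the question concrete, here is a minimal sketch of how TPM relates to raw counts, next to the raw-count > 10 filter. The gene names, counts and effective lengths are made-up toy values, and `tpm` is a hypothetical helper. Salmon already reports TPM, so this is only meant to show what the number represents, not to replace its output.

```python
def tpm(counts, eff_lengths):
    """Convert per-gene read counts to TPM using effective lengths (in bases)."""
    # Length-normalized rate per gene: reads per base of effective length.
    rates = {gene: counts[gene] / eff_lengths[gene] for gene in counts}
    total = sum(rates.values())
    # Rescale so that the values of one sample sum to one million.
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}


# Hypothetical toy sample (not real data).
counts = {"geneA": 500, "geneB": 12, "geneC": 3, "geneD": 0}
eff_lengths = {"geneA": 2000.0, "geneB": 300.0, "geneC": 1500.0, "geneD": 1000.0}

tpm_values = tpm(counts, eff_lengths)

MIN_COUNT = 10  # raw-count cutoff taken from the question
passing = sorted(gene for gene, c in counts.items() if c > MIN_COUNT)

print(passing)
print({gene: round(value, 1) for gene, value in tpm_values.items()})
```

Because TPM is rescaled within each sample, the same raw count can correspond to quite different TPM values depending on gene length and on what else is expressed in that sample, which is one reason the two filters do not always agree.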

I'm sorry if this post sounds repetitive with others. I really appreciate any suggestions.

Thank you for your time!

RNA-Seq • 1.5k views

If you are working with different tissues, why can't you do a differential expression analysis? Even though both individuals are healthy, you have 2 different tissues, which are 2 different conditions. This is what DESeq2 actually does: it compares different conditions.
The issue of replicates coming from 2 different individuals can be easily addressed with a batch effect correction (this post explains how to do it).


Actually I'm not working with 2 different tissues but with many. For each tissue from one individual I have a biological replicate of the same tissue from the other individual. I don't know whether I have a consistent batch effect, because the procedures (e.g. poly-A selection, library preparation etc.) were performed on the same day. That said, I could of course still have a strong batch effect, so I'm building a PCA plot to check for it and, if necessary, correct it.

Anyway, with DESeq it is possible to set a tissue as reference and perform the DEA versus the other tissues, right? So your advice is to set one tissue as reference each time and then perform DEA against the other tissues?


Exactly! In an example with 2 tissues, with A as the reference tissue:
You will find that the vast majority of genes won't change, some will have a positive log2FoldChange (higher in B, i.e. B specific) and some will have a negative log2FoldChange (higher in A, i.e. A specific).
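The sign convention can be illustrated with a small sketch. It computes a plain log2 ratio of mean normalized expression with a pseudocount, which is a simplification and not DESeq2's model-based estimate. The function names, the pseudocount, the |log2FC| >= 1 cutoff and the expression values are all assumptions chosen for illustration.

```python
import math


def log2_fold_change(mean_ref, mean_other, pseudocount=1.0):
    """Plain log2 ratio of 'other' over 'reference' (here: B over A)."""
    return math.log2((mean_other + pseudocount) / (mean_ref + pseudocount))


def direction(lfc, min_abs=1.0):
    """Label a gene by the sign of its log2 fold change (cutoff is arbitrary)."""
    if abs(lfc) < min_abs:
        return "unchanged"
    return "higher in B" if lfc > 0 else "higher in A"


# Hypothetical mean normalized expression: gene -> (tissue A, tissue B).
genes = {"gene1": (100.0, 100.0), "gene2": (10.0, 400.0), "gene3": (300.0, 5.0)}

labels = {}
for gene, (mean_a, mean_b) in genes.items():
    lfc = log2_fold_change(mean_a, mean_b)
    labels[gene] = direction(lfc)
    print(gene, round(lfc, 2), labels[gene])
```

With A as the reference, the ratio is always B over A, so swapping the reference only flips the sign of every value.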


It is probably a stupid question, so sorry. But is it possible to compare A vs B, C, D and so on at the same time, or is it necessary to compare A vs B, then A vs C and so on?


You can do all the comparisons at the same time; read the section of the DESeq2 vignette about the design if you have any doubts. As you are only comparing 1 variable (tissue) it will (should) be easy. Good luck!
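As a rough illustration of what "all the comparisons" means with a single tissue variable, the sketch below only enumerates the comparison pairs; it does not run any statistics. The tissue labels and both helper functions are hypothetical. In DESeq2 itself the model would be fitted once with a design of `~ tissue`, and each pair would then be requested from `results()` as a contrast.

```python
from itertools import combinations

tissues = ["A", "B", "C", "D"]  # hypothetical tissue labels


def versus_reference(tissues, reference):
    """Pairs (other, reference): every tissue compared against one reference."""
    return [(t, reference) for t in tissues if t != reference]


def all_pairs(tissues):
    """Every unordered pair once, written as (numerator, denominator)."""
    return [(b, a) for a, b in combinations(tissues, 2)]


print(versus_reference(tissues, "A"))
print(all_pairs(tissues))
```

With n tissues there are n - 1 comparisons against a fixed reference and n * (n - 1) / 2 pairwise comparisons in total, so the second list grows quickly when there are many tissues.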

4.1 years ago

Not sure what you mean by "active" and "background" gene expression. If you just want to get a list of genes which are robustly expressed in each tissue then you can ignore the raw counts and use TPM (e.g. TPM >1 is reasonable).
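A minimal sketch of that kind of per-tissue list is below. The tissue names, gene names and TPM values are invented, and requiring the gene to pass the cutoff in both replicates is an assumption added here, not something stated above.

```python
TPM_CUTOFF = 1.0  # the example threshold mentioned above

# Hypothetical table: tissue -> gene -> (TPM in replicate 1, TPM in replicate 2).
tpm_table = {
    "liver": {"geneA": (35.2, 41.0), "geneB": (0.4, 1.8), "geneC": (0.0, 0.1)},
    "brain": {"geneA": (2.1, 1.6), "geneB": (12.0, 9.5), "geneC": (0.9, 0.7)},
}


def expressed_genes(replicate_tpms, cutoff=TPM_CUTOFF):
    """Genes whose TPM exceeds the cutoff in every replicate of one tissue."""
    return sorted(
        gene
        for gene, values in replicate_tpms.items()
        if all(value > cutoff for value in values)
    )


expressed = {tissue: expressed_genes(genes) for tissue, genes in tpm_table.items()}
print(expressed)
```

Replacing `all` with `any` would give the more permissive rule (above the cutoff in at least one replicate); which one is appropriate depends on how conservative the list should be.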


Thank you for your answer, but I think there should be a statistical way to establish a cut-off above which we can consider a particular gene to be expressed. I mean, why TPM>1? Reading through different posts, I realized that we cannot set a cut-off (e.g. TPM>1) arbitrarily.


You could use the p values from a differential expression tool (e.g. DESeq2) for this. I've seen it suggested as a way to get a background set of genes, e.g. for GO term analysis. Anything that is given any p value (significant or not) by the tool is expressed sufficiently to be processed by the tool. Basically, just filter out all the "NA" values.

It's not as stringent as TPM>1 (not "robustly expressed genes") but maybe it would give you more statistical back-up if that's what you're after.
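The filtering step itself is simple; a sketch is below. It assumes the results table has already been exported and read into a dictionary, with NA represented as `None` or NaN. The gene names, p values and the `has_pvalue` helper are hypothetical.

```python
import math

# Hypothetical p values from a results table: gene -> p value (None/NaN = NA).
pvalues = {"geneA": 0.0004, "geneB": 0.73, "geneC": None, "geneD": float("nan")}


def has_pvalue(p):
    """True if the tool reported a usable p value for this gene."""
    return p is not None and not math.isnan(p)


# Keep every gene with a p value, whether significant or not.
background = sorted(gene for gene, p in pvalues.items() if has_pvalue(p))
print(background)  # ['geneA', 'geneB']
```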
