Question

How to evaluate gene expression

0

4.6 years ago

Lorenzo ▴ 10

Hi Biostars community,

Let me explain my situation: I have to determine the expression of different genes in different tissues. Each tissue has a biological replicate from a second individual (the only difference between the two is 3 months of age). Both individuals are healthy, so I expect no significant difference in gene expression between them. For that reason, I don't think it is useful here to perform a differential expression analysis (DEA) with DESeq or similar tools.

I calculated TPM values from raw reads with Salmon. I know there are many posts about this, but each one gives different information and none of them has a definitive answer.

My question is: in my situation, is it fine to use TPM, keep only the genes whose summarized raw counts are > 10, and consider those genes expressed? Or is it better to move to another normalization method such as zFPKM to identify the threshold between active and background gene expression?

I'm sorry if this post repeats what others have asked. I really appreciate any suggestions.

Thank you for your time!

RNA-Seq • 1.9k views

ADD COMMENT • link updated 4.6 years ago by matt.a.bennett25890 ▴ 30 • written 4.6 years ago by Lorenzo ▴ 10

0

If you are working with different tissues, why can't you do a differential expression analysis? Although both individuals are healthy, you have 2 different tissues, which are 2 different conditions. Comparing conditions is exactly what DESeq2 does.
The fact that the replicates come from 2 different individuals can be handled with a batch effect correction (this post describes how to do it).

ADD REPLY • link 4.6 years ago by jordi.planells ▴ 480

0

Actually, I'm not working with 2 different tissues but with many. For each tissue from one individual, I have a biological replicate of the same tissue from the other individual. I don't know whether I have a consistent batch effect, because the procedures (e.g. poly-A selection, library preparation etc.) were performed on the same day. That said, I could of course still have a strong batch effect, so I'm building a PCA plot to check for it and correct it if necessary.

Anyway, with DESeq it is possible to set one tissue as the reference and perform the DEA against the other tissues, right? So your advice is to set a tissue as the reference each time and then perform the DEA against the other tissues?

ADD REPLY • link 4.6 years ago by Lorenzo ▴ 10

0

Exactly! In a two-tissue example, with A as the reference tissue:
You will find that the vast majority of genes don't change, some have a positive log2FoldChange (higher in B, i.e. B-specific), and some have a negative log2FoldChange (higher in A, i.e. A-specific).
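
To make the sign convention concrete, here is a minimal Python sketch with made-up numbers. It uses a plain log2 ratio with a pseudocount, which is an assumption for illustration only: DESeq2 estimates fold changes from a fitted model, not from this formula. The gene names, values, and the `log2_fold_change` helper are all hypothetical.

```python
import math

def log2_fold_change(mean_b, mean_a, pseudocount=1.0):
    """Plain log2 ratio of tissue B over reference tissue A (illustration only)."""
    return math.log2((mean_b + pseudocount) / (mean_a + pseudocount))

# Hypothetical mean normalized counts per gene: (tissue A, tissue B)
genes = {
    "geneX": (7.0, 255.0),   # higher in B -> positive value
    "geneY": (255.0, 7.0),   # higher in A -> negative value
    "geneZ": (99.0, 99.0),   # unchanged   -> zero
}

lfc = {gene: log2_fold_change(b, a) for gene, (a, b) in genes.items()}
for gene, value in lfc.items():
    print(f"{gene}: log2FoldChange = {value:+.1f}")
```

With A as the reference, the ratio is always B over A, so swapping the reference tissue flips every sign.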

ADD REPLY • link 4.6 years ago by jordi.planells ▴ 480

0

This is probably a stupid question, so sorry. But is it possible to compare A vs B, C, D and so on at the same time, or is it necessary to compare A vs B, then A vs C, and so on?

ADD REPLY • link 4.6 years ago by Lorenzo ▴ 10

0

You can do all the comparisons at the same time; see the section on the design in the DESeq2 vignette if you have any doubts. As you are only comparing one variable (tissue), it should be easy. Good luck!
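
As a rough illustration of what "all the comparisons" covers, the Python sketch below only enumerates the comparisons for a single `tissue` factor; it does not run DESeq2. The tissue labels are hypothetical, and the three-element form (factor, numerator, denominator) is meant to mirror the character-vector `contrast` argument of DESeq2's `results()` function.

```python
from itertools import combinations

tissues = ["A", "B", "C", "D"]  # hypothetical tissue labels

# Comparisons against one reference tissue (B vs A, C vs A, D vs A)
reference = "A"
vs_reference = [("tissue", t, reference) for t in tissues if t != reference]

# Every pairwise comparison between tissues, written as (factor, numerator, denominator)
all_pairs = [("tissue", b, a) for a, b in combinations(tissues, 2)]

for contrast in all_pairs:
    print(contrast)
```

The model is fitted once on all samples, and each tuple then corresponds to one results table extracted from that single fit.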

ADD REPLY • link 4.6 years ago by jordi.planells ▴ 480

score 0 · Answer 1 · 2020-12-03

0

4.6 years ago

matt.a.bennett25890 ▴ 30

Not sure what you mean by "active" and "background" gene expression. If you just want a list of genes that are robustly expressed in each tissue, then you can ignore the raw counts and use TPM (e.g. TPM > 1 is reasonable).
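
A minimal Python sketch of such a filter, assuming the TPM values are already loaded into dictionaries. The gene names, values, and the `expressed_genes` helper are hypothetical, and requiring the threshold in both replicates is an added assumption, not something stated above.

```python
def expressed_genes(tpm_by_sample, threshold=1.0):
    """Return genes whose TPM exceeds the threshold in every given sample."""
    samples = list(tpm_by_sample.values())
    # Only consider genes that were quantified in all samples
    shared = set(samples[0]).intersection(*samples[1:])
    return sorted(g for g in shared if all(s[g] > threshold for s in samples))

# Hypothetical TPM values for one tissue, two individuals
liver = {
    "individual_1": {"geneX": 35.2, "geneY": 0.4, "geneZ": 1.8},
    "individual_2": {"geneX": 41.0, "geneY": 1.3, "geneZ": 2.1},
}

liver_expressed = expressed_genes(liver)
print(liver_expressed)
```

Running this once per tissue gives one list of expressed genes per tissue; the `threshold` argument makes it easy to see how sensitive the lists are to the chosen cut-off.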

ADD COMMENT • link 4.6 years ago by matt.a.bennett25890 ▴ 30

0

Thank you for your answer, but I think there should be a statistical way to establish a cut-off above which a particular gene can be considered expressed. I mean, why TPM > 1? Reading through different posts, I realized that a cut-off (e.g. TPM > 1) should not be chosen arbitrarily.

ADD REPLY • link 4.6 years ago by Lorenzo ▴ 10

0

You could use the p-values from a differential expression tool (e.g. DESeq2) for this. I've seen it suggested as a way to get a background set of genes, e.g. for GO term analysis. Any gene that the tool assigns a p-value to (significant or not) is expressed sufficiently to be processed by the tool. Basically, just filter out all the "NA" values.

It's not as stringent as TPM > 1 (these are not "robustly expressed genes"), but it may give you more statistical backing if that's what you're after.
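
A minimal Python sketch of that filtering step, assuming the results table has been exported and read into a list of rows. The rows, gene names, and the `has_p_value` helper are hypothetical; `None` and NaN stand in for the "NA" entries.

```python
import math

def has_p_value(p):
    """True when a p-value is present (neither None nor NaN)."""
    return p is not None and not math.isnan(p)

# Hypothetical rows from an exported differential expression results table
results = [
    {"gene": "geneX", "pvalue": 0.0004},
    {"gene": "geneY", "pvalue": None},
    {"gene": "geneZ", "pvalue": 0.81},
    {"gene": "geneW", "pvalue": float("nan")},
]

# Keep every gene that received a p-value, significant or not
background = [row["gene"] for row in results if has_p_value(row["pvalue"])]
print(background)
```

Note that the filter deliberately ignores the size of the p-value: only its presence is used to decide whether a gene counts as expressed.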

ADD REPLY • link 4.6 years ago by matt.a.bennett25890 ▴ 30