Question

Reasoning - Why is it invalid to compare ratios of TPMs?

0

4 months ago

VTh • 0

Dear community,

my supervisor has asked me to analyse bulk RNA-seq data. The samples were acquired at different times in different experiments without shared controls, so no correction for batch effects is possible and differential gene expression analysis is unreliable.

We are discussing which analyses we can still do given the imperfect situation. One thing that seems to be possible is to perform within-sample comparisons, by calculating TPMs and comparing the expression of GeneX to that of GeneY.

Now, why can't we take the ratio of GeneX / GeneY and compare these ratios between SampleA and SampleB?

I have a hard time justifying why I think this is a bad idea, because I share the intuition that it ought to work. My "hunch" is that by calculating the ratio, we are basically performing count normalization, but with just GeneY as the reference instead of a robust summary over all genes, as with DESeq2's median-of-ratios or edgeR's TMM. Comparing the ratios across samples is therefore equivalent to a differential gene expression analysis (which we previously established is flawed) that normalizes on a single gene. However, this does not make batch effects go away; they're just ignored. Thus, this approach is only "valid" if GeneX and GeneY happen to be exempt from any biases. Would you agree with this?
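To make the "normalization on a single gene" point concrete, here is a minimal sketch with made-up counts and gene lengths (all numbers are hypothetical, and `tpm` is just an illustrative helper). It suggests that the within-sample TPM ratio of GeneX to GeneY reduces to the ratio of their length-normalized counts: the sample-wide scaling factor cancels, so the rest of the transcriptome plays no role in the ratio.

```python
def tpm(counts, lengths_kb):
    """Transcripts per million: length-normalise, then scale so the sum is 1e6."""
    rates = [c / l for c, l in zip(counts, lengths_kb)]
    total = sum(rates)
    return [r / total * 1e6 for r in rates]

# Hypothetical sample: index 0 = GeneX, index 1 = GeneY, the rest are other genes.
counts = [500.0, 250.0, 10000.0, 40.0]
lengths_kb = [2.0, 1.0, 4.0, 0.5]

values = tpm(counts, lengths_kb)
ratio_tpm = values[0] / values[1]

# The same ratio computed from length-normalised counts only, with no TPM scaling.
ratio_rates = (counts[0] / lengths_kb[0]) / (counts[1] / lengths_kb[1])

# Changing the other genes changes the TPM values, but not the X/Y ratio.
counts_changed = [500.0, 250.0, 90000.0, 4000.0]
values_changed = tpm(counts_changed, lengths_kb)
ratio_tpm_changed = values_changed[0] / values_changed[1]

print(ratio_tpm, ratio_rates, ratio_tpm_changed)
```

In other words, the ratio uses GeneY as the only reference, so anything that distorts GeneX or GeneY specifically goes straight into the ratio.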

Somehow I have a hole in my thinking going from "TPMs are okay intra-sample" to "comparing ratios of valid intra-sample TPMs across samples is invalid". Doesn't this by proxy imply that performing intra-sample TPM comparisons is kind of moot anyway because the information gained cannot be put into context?

Maybe someone can help me nudge my train of thought in a productive direction.

Thanks!

RNAseq analysis expression normalization differential TPM batch effect • 821 views

ADD COMMENT • link updated 4 months ago by ATpoint 89k • written 4 months ago by VTh • 0

score 6 · Accepted Answer · 2025-03-12

6

4 months ago

ATpoint 89k

Batch effects affect some genes more than others, and they do so in a non-reproducible fashion. Hence, the ratio of X over Y in batch 1 is not the same as in batch 2 even when the underlying biology is identical, and as such the ratios are not meaningful across batches.
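A toy simulation can illustrate this. The sketch below assumes that a batch effect acts as a gene-specific multiplicative bias drawn at random for each batch; that model, the bias spread of 0.3, and all names are assumptions made for illustration, and sampling noise is left out entirely. The true abundances are identical in both batches, yet the X/Y ratio shifts by exactly the ratio of the gene-specific biases.

```python
import random

def tpm(counts, lengths_kb):
    """Transcripts per million: length-normalise, then scale so the sum is 1e6."""
    rates = [c / l for c, l in zip(counts, lengths_kb)]
    total = sum(rates)
    return [r / total * 1e6 for r in rates]

random.seed(0)
n_genes = 1000
lengths_kb = [random.uniform(0.5, 5.0) for _ in range(n_genes)]
# Same true abundances in both batches: there is no biological difference.
true_abundance = [random.lognormvariate(3.0, 1.5) for _ in range(n_genes)]

def simulate_batch(bias_sd):
    """Expected counts when every gene gets its own multiplicative batch bias."""
    bias = [random.lognormvariate(0.0, bias_sd) for _ in range(n_genes)]
    counts = [a * l * b for a, l, b in zip(true_abundance, lengths_kb, bias)]
    return counts, bias

counts_1, bias_1 = simulate_batch(0.3)
counts_2, bias_2 = simulate_batch(0.3)
tpm_1 = tpm(counts_1, lengths_kb)
tpm_2 = tpm(counts_2, lengths_kb)

x, y = 0, 1  # indices standing in for GeneX and GeneY
ratio_1 = tpm_1[x] / tpm_1[y]
ratio_2 = tpm_2[x] / tpm_2[y]

# Apparent "change" in the ratio between batches, with no biology behind it.
apparent_fold = ratio_2 / ratio_1
# It is fully explained by the gene-specific biases of X and Y in each batch.
expected_fold = (bias_2[x] / bias_2[y]) / (bias_1[x] / bias_1[y])

print(ratio_1, ratio_2, apparent_fold, expected_fold)
```

Because the biases are unknown and differ from gene to gene, there is no way to tell from the data alone how much of an observed ratio difference is batch and how much is biology.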

I cannot emphasize enough that you should not try to squeeze something out of confounded experiments. It is tempting, and I have been there as well, but in the end you always realize that the batch effect is present, cannot be corrected, and undermines your efforts. Just don't. It is not worth the time.

ADD COMMENT • link 4 months ago by ATpoint 89k

0

Thanks. I agree with your sentiment. This is tangential to the main question, but I still have a mental disconnect when it comes to within-sample TPMs. When would they ever be useful if the information gained from them cannot easily (or at all?) be extrapolated to other samples?

ADD REPLY • link 4 months ago by VTh • 0

0

I can only speak for myself, but I have never looked at gene ratios, nor do I see a use case for them.

ADD REPLY • link 4 months ago by ATpoint 89k