Question

Removing batch effects

0

4.1 years ago

avelarbio46 ▴ 30

Hello everyone!

I've been trying to compare our own RNA-seq samples with samples from external data sources, for example breast cancer samples from one study against breast cancer samples from TCGA.

Even when the samples are biologically equivalent, they differ substantially because of "batch" effects (different kits, different populations, different machines, etc.).

What I'm doing is:

1. Create a DESeq2 object in R
2. Obtain FPM values with fpm(deseqobject)
3. Remove noise genes (low reads, pseudogenes, etc.)
4. Compare clustering algorithms and visualize everything with a PCA

Of course, when I do this, all samples from one study cluster together, while the TCGA samples form another cluster. I don't want to do differential expression analysis; I need the FPM values of all samples.

Is there a way to remove all "batch" effects in DESeq2 for this purpose (using batch as a covariate, maybe)?

Or should I remove the batch effect using limma or ComBat?
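To make the question concrete, here is a minimal sketch of what a simple linear batch adjustment does to log-scale expression values for a single gene: each batch is shifted so that its mean matches the overall mean. This is a toy stand-in written in plain Python, not the implementation used by limma, ComBat or DESeq2. The function name `center_batches` and the numbers are made up for illustration, and the example assumes a balanced design in which both batches contain normal and tumor samples.

```python
from statistics import mean

def center_batches(values, batches):
    """Shift each batch so its mean equals the overall mean (one gene)."""
    overall = mean(values)
    batch_means = {
        b: mean(v for v, bb in zip(values, batches) if bb == b)
        for b in set(batches)
    }
    return [v - batch_means[b] + overall for v, b in zip(values, batches)]

# Hypothetical log2 expression of one gene: two normal and two tumor
# samples per batch, with the TCGA batch shifted upward by 3 units.
values = [5.0, 5.2, 7.0, 7.2, 8.0, 8.2, 10.0, 10.2]
batches = ["own"] * 4 + ["tcga"] * 4

corrected = center_batches(values, batches)
print([round(v, 2) for v in corrected])
```

In this balanced toy case the two batches end up on the same scale while the tumor-versus-normal difference within each batch is unchanged. That only holds because each batch contains both conditions.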

I know there are some answers here and there, but most people want batch effect removal for differential expression, so I thought it best to ask.

batch-effect RNA-Seq • 1.5k views

ADD COMMENT • link updated 4.1 years ago by swbarnes2 14k • written 4.1 years ago by avelarbio46 ▴ 30

score 0 · Answer 1 · 2021-03-09

0

4.1 years ago

swbarnes2 14k

If all your control samples come from one lab, and your cancer samples all come from another lab, there is no math magic that will remove that batch effect.

ADD COMMENT • link 4.1 years ago by swbarnes2 14k

0

I've been researching this, because integrating data across studies seems very important. But, of course, "batch" effects are a big problem. Many papers are addressing this, and I've found some nice tools such as ComBat-seq (an updated version of ComBat), which can be used for more extreme batch differences between samples (https://academic.oup.com/nargab/article/2/3/lqaa078/5909519).

Of course there is no "magic" math, but real researchers are trying different approaches to solve this type of problem, even though you might not like this type of analysis.

ADD REPLY • link 4.1 years ago by avelarbio46 ▴ 30

0

It's not about what I like. If batch effect is confounded with experimental differences, there is no algorithm that will separate them.
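One way to see the confounding argument is through the design matrix of a linear model with an intercept, a batch indicator and a condition indicator. If every control comes from one lab and every cancer sample from the other, the batch and condition columns are identical, the matrix loses rank, and the two effects cannot be estimated separately. The sketch below is a toy illustration in plain Python; the helper names `matrix_rank` and `design` and the sample layouts are assumptions made up for the example.

```python
def matrix_rank(rows, tol=1e-9):
    """Rank of a small matrix via Gaussian elimination with pivoting."""
    m = [[float(x) for x in r] for r in rows]
    rank = 0
    for col in range(len(m[0])):
        if rank == len(m):
            break
        # Pick the row (at or below `rank`) with the largest entry in this column.
        pivot = max(range(rank, len(m)), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            continue  # no usable pivot: this column adds no new information
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

def design(batch, condition):
    """One row per sample: intercept, batch indicator, condition indicator."""
    return [[1, b, c] for b, c in zip(batch, condition)]

batch = [0, 0, 0, 1, 1, 1]

# Confounded: all controls in batch 0, all cancer samples in batch 1.
confounded = design(batch, [0, 0, 0, 1, 1, 1])

# Balanced: each batch contains both controls and cancer samples.
balanced = design(batch, [0, 1, 0, 1, 0, 1])

print("confounded rank:", matrix_rank(confounded))
print("balanced rank:", matrix_rank(balanced))
```

With three columns, a rank below 3 means the batch and condition coefficients are not separately identifiable, whatever correction tool is applied afterwards.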

ADD REPLY • link 4.1 years ago by swbarnes2 14k