How do you choose which biological variables to preserve when adjusting RNA-seq data for batch effects?
2.3 years ago
ev97 ▴ 40

For RNA-seq data, I have come across two tools for adjusting the data for batch or other unwanted variables:

  • ComBat-Seq (sva package)
  • limma (removeBatchEffect function)

Functions:

ComBat-Seq

ComBat_seq(counts, batch, group = NULL, covar_mod = NULL, full_mod = TRUE, shrink = FALSE, shrink.disp = FALSE, gene.subset.n = NULL)

Where:

  • group: Vector / factor for biological condition of interest

  • covar_mod: Model matrix for multiple covariates to include in linear model (signals from these variables are kept in data after adjustment)

limma

removeBatchEffect(x, batch=NULL, batch2=NULL, covariates=NULL, design=matrix(1,ncol(x),1), ...)

Where:

  • design: design matrix relating to treatment conditions to be preserved, usually the design matrix with all experimental factors other than the batch effects.
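As a minimal sketch of how these arguments fit together, the R code below builds one set of "variables to keep" and passes it to both functions. The sample sheet `meta`, its columns (`batch`, `condition`, `sex`) and the simulated counts are hypothetical and only for illustration. The package calls are guarded, so the block still runs if limma or sva is not installed.

```r
# Hypothetical sample sheet: 2 batches, with condition and sex varying inside each batch
meta <- data.frame(
  batch     = factor(rep(c("b1", "b2"), each = 6)),
  condition = factor(rep(rep(c("ctrl", "trt"), each = 3), times = 2)),
  sex       = factor(rep(c("F", "M"), times = 6))
)

# Simulated counts (200 genes x 12 samples), for illustration only
set.seed(1)
counts <- matrix(rnbinom(200 * 12, size = 10, mu = 100), nrow = 200,
                 dimnames = list(paste0("gene", 1:200), paste0("s", 1:12)))

# Variables whose signal should be kept, expressed as a design matrix (no batch term)
design <- model.matrix(~ condition + sex, data = meta)

# limma works on log-expression values; the variables to keep go in `design`
if (requireNamespace("limma", quietly = TRUE)) {
  logexpr  <- log2(counts + 1)
  adj_log  <- limma::removeBatchEffect(logexpr, batch = meta$batch, design = design)
}

# ComBat-Seq works on raw counts; the main condition goes in `group`,
# any further variables to keep go in `covar_mod`
if (requireNamespace("sva", quietly = TRUE)) {
  covar_mod  <- model.matrix(~ sex, data = meta)
  adj_counts <- sva::ComBat_seq(counts, batch = meta$batch,
                                group = meta$condition, covar_mod = covar_mod)
}
```

Whether a variable such as `sex` belongs in the preserved set depends on the study; here it is included only to show where it would go.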

Both methods have a parameter for specifying the biological variables or conditions whose signal should be preserved after the adjustment (group and covar_mod in ComBat-Seq, design in limma).
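To make the role of that parameter concrete, here is a toy base-R simulation for a single gene. It is a sketch of the general idea (estimate the batch shift in a linear model, then subtract it), not the actual implementation of either package. All names and effect sizes are made up. The condition is deliberately unbalanced across batches, which is when protecting it matters.

```r
set.seed(42)

# Two batches with an unbalanced condition: b1 is mostly trt, b2 is mostly ctrl
batch     <- factor(rep(c("b1", "b2"), each = 6))
condition <- factor(c("ctrl", "ctrl", "trt", "trt", "trt", "trt",
                      "ctrl", "ctrl", "ctrl", "ctrl", "trt", "trt"))

# One simulated gene: true condition effect = 2, true batch shift = 3
y <- 5 + 2 * (condition == "trt") + 3 * (batch == "b2") + rnorm(12, sd = 0.05)

# Batch shift estimated WITHOUT protecting condition (batch-only model)
fit_batch_only <- lm(y ~ batch)
y_unprotected  <- y - coef(fit_batch_only)["batchb2"] * (batch == "b2")

# Batch shift estimated WITH condition in the model, so condition is protected
fit_full    <- lm(y ~ condition + batch)
y_protected <- y - coef(fit_full)["batchb2"] * (batch == "b2")

# Difference in means between trt and ctrl after each adjustment
condition_effect <- function(v) {
  mean(v[condition == "trt"]) - mean(v[condition == "ctrl"])
}
eff_unprotected <- condition_effect(y_unprotected)
eff_protected   <- condition_effect(y_protected)
```

In this setup `eff_protected` stays close to the simulated effect of 2, while `eff_unprotected` comes out smaller, because the batch-only model absorbs part of the condition difference into its batch estimate.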

If you have a lot of biological information, the natural instinct is to preserve as much of that signal as possible. However, this is not feasible: if you protect every variable, there is little or nothing left for the batch adjustment to remove.
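One concrete limit on what can be kept is confounding: a variable that is completely aligned with batch cannot be separated from it. A possible pre-check, sketched in base R below, is to test whether the combined model (variables to keep plus batch) is rank deficient. The helper `is_rank_deficient` and the sample sheet are hypothetical names for illustration.

```r
# Hypothetical sample sheet in which sex is completely aligned with batch
meta <- data.frame(
  batch     = factor(rep(c("A", "B"), each = 4)),
  condition = factor(rep(c("ctrl", "trt"), times = 4)),
  sex       = factor(rep(c("F", "M"), each = 4))
)

# TRUE when some column of the model matrix is a linear combination of the
# others, i.e. the listed variables cannot all be estimated alongside batch
is_rank_deficient <- function(formula, data) {
  mm <- model.matrix(formula, data = data)
  qr(mm)$rank < ncol(mm)
}

# condition varies within each batch, so it can be separated from batch
ok_condition <- is_rank_deficient(~ condition + batch, meta)

# sex is identical to batch in this sheet, so the model cannot tell them apart
ok_with_sex <- is_rank_deficient(~ condition + sex + batch, meta)
```

Here `ok_condition` is `FALSE` and `ok_with_sex` is `TRUE`, which would suggest that sex cannot be both preserved and distinguished from batch in this particular layout.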

Question: In your experience, how do you decide which variables to include as the ones to preserve?

I would really appreciate any feedback.

Thanks very much in advance

rnaseq combat-seq batch removeBatchEffect limma