The FindMarkers function in Seurat lets users specify latent variables to adjust for when finding differentially expressed genes. I am testing for differences in gene expression between two groups, disease vs. normal. For the statistical test I am using LR, described below:
LR: disease state is modelled by logistic regression on the gene's expression level and the latent variables (diagnosis ~ latent.vars + gene.X). This is compared to a null model (diagnosis ~ latent.vars) with a likelihood ratio test to obtain the final p-value.
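To make the comparison concrete, here is a minimal sketch of a likelihood ratio test between the two logistic models, written in plain Python. It is not Seurat's implementation. The toy data, the single binary latent variable (batch), and the helper names solve and fit_logistic are all made up for illustration.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def fit_logistic(X, y, n_iter=50):
    """Fit a logistic regression by Newton-Raphson; return (coefficients, log-likelihood)."""
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        grad = [0.0] * p
        # Tiny diagonal term keeps the Hessian invertible in degenerate cases.
        hess = [[1e-8 if i == j else 0.0 for j in range(p)] for i in range(p)]
        for xi, yi in zip(X, y):
            mu = 1.0 / (1.0 + math.exp(-sum(b * x for b, x in zip(beta, xi))))
            for i in range(p):
                grad[i] += (yi - mu) * xi[i]
                for j in range(p):
                    hess[i][j] += mu * (1.0 - mu) * xi[i] * xi[j]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < 1e-10:
            break
    loglik = 0.0
    for xi, yi in zip(X, y):
        mu = 1.0 / (1.0 + math.exp(-sum(b * x for b, x in zip(beta, xi))))
        loglik += yi * math.log(mu) + (1 - yi) * math.log(1.0 - mu)
    return beta, loglik

# Hypothetical toy cells: (diagnosis, batch, expression of gene X); 1 = disease.
cells = [
    (1, 0, 2.0), (1, 0, 1.5), (1, 1, 2.5), (1, 1, 0.5), (1, 0, 1.8), (1, 1, 2.2),
    (0, 0, 0.5), (0, 0, 1.6), (0, 1, 0.7), (0, 1, 2.0), (0, 0, 0.4), (0, 1, 1.0),
]
y = [diag for diag, _, _ in cells]

# Null model: diagnosis ~ intercept + batch
X_null = [[1.0, float(batch)] for _, batch, _ in cells]
# Full model: diagnosis ~ intercept + batch + gene X
X_full = [[1.0, float(batch), expr] for _, batch, expr in cells]

_, ll_null = fit_logistic(X_null, y)
_, ll_full = fit_logistic(X_full, y)

# The full model has one extra parameter, so the statistic is compared
# to a chi-square distribution with 1 degree of freedom.
lr_stat = max(0.0, 2.0 * (ll_full - ll_null))
p_value = math.erfc(math.sqrt(lr_stat / 2.0))  # chi-square (1 df) upper tail

print(f"LR statistic = {lr_stat:.3f}, p-value = {p_value:.4f}")
```

Note that nothing in this test produces a fold change. The p-value comes from the difference in log-likelihoods alone.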
While this adjusts the p-value, the latent variables are not taken into account when calculating the log fold change: Seurat simply compares the means of the normalized counts. My solution is simple, but it relies on an assumption that I am not sure is correct. Here is the assumption:
Assume that the unadjusted fold change is the product of the fold change due to latent variables and the fold change due to disease state:
FC.raw = FC.latent * FC.diagnosis
Then FC.diagnosis = FC.raw / FC.latent.
The fold change due to latent variables can be calculated by modeling the normalized counts with respect to the latent variables and using the predicted counts from the model to calculate fold change.
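The calculation described above can be sketched as follows. This is only an illustration under stated assumptions: the data are invented, expression is treated as being on a linear (not log) scale, and there is a single categorical latent variable (batch), in which case the least-squares prediction from expression ~ batch is simply the batch mean. The names cells, group_mean, and batch_mean are hypothetical.

```python
from statistics import mean

# Hypothetical toy cells: (diagnosis, batch, expression on a linear scale).
cells = [
    ("disease", "b1", 8.0), ("disease", "b1", 8.0),
    ("disease", "b1", 8.0), ("disease", "b2", 2.0),
    ("normal", "b1", 4.0), ("normal", "b2", 1.0),
    ("normal", "b2", 1.0), ("normal", "b2", 1.0),
]

def group_mean(values, group):
    """Mean of per-cell values over the cells belonging to one diagnosis group."""
    return mean(v for (diag, _, _), v in zip(cells, values) if diag == group)

# Step 1: unadjusted fold change from the observed expression.
observed = [expr for _, _, expr in cells]
fc_raw = group_mean(observed, "disease") / group_mean(observed, "normal")

# Step 2: model expression ~ batch. With one categorical covariate,
# the fitted value for each cell is the mean of its batch.
batches = {batch for _, batch, _ in cells}
batch_mean = {b: mean(e for _, bb, e in cells if bb == b) for b in batches}
predicted = [batch_mean[batch] for _, batch, _ in cells]

# Step 3: fold change explained by the latent variable alone.
fc_latent = group_mean(predicted, "disease") / group_mean(predicted, "normal")

# Step 4: apply the multiplicative assumption FC.raw = FC.latent * FC.diagnosis.
fc_diagnosis = fc_raw / fc_latent

print(f"FC.raw = {fc_raw:.3f}")
print(f"FC.latent = {fc_latent:.3f}")
print(f"FC.diagnosis = {fc_diagnosis:.3f}")
```

The toy data were built so that disease cells are exactly twice as high as normal cells within each batch, which gives a reference value to compare the printed FC.diagnosis against.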
I'd like to know whether this is a valid assumption, or whether there are other methods of reporting adjusted fold changes for single-cell data.
Side note: I think you're referring to removing the "effect", not "affect".
Thanks Ram, noted.