I encountered the exact same error when I tried to remove known batch effects using the ComBat function. I found that two things matter:
- the type of 'dat': it should be a matrix, not a data.frame;
- the variance of each variable (row) in 'dat' should not equal zero.
Once these two conditions are satisfied, ComBat runs successfully.
I have attached the code and the run log below to make this clearer.
> batch = wu_our_cli$group
> modcombat = model.matrix(~1, data=wu_our_cli)
> eRNA=wu_our_RNA[apply(wu_our_RNA,1,var)>0,]
> class(wu_our_RNA)
[1] "data.frame"
> combat_edata = ComBat(dat=as.matrix(wu_our_RNA), batch=batch, mod=modcombat, par.prior=TRUE, prior.plots=F)
Found2batches
Adjusting for0covariate(s) or covariate level(s)
Standardizing Data across genes
Fitting L/S model and finding priors
Finding parametric adjustments
Error in while (change > conv) { : missing value where TRUE/FALSE needed
> combat_edata = ComBat(dat=as.matrix(eRNA), batch=batch, mod=modcombat, par.prior=TRUE, prior.plots=F)
Found2batches
Adjusting for0covariate(s) or covariate level(s)
Standardizing Data across genes
Fitting L/S model and finding priors
Finding parametric adjustments
Adjusting the Data
> combat_edata = ComBat(dat=eRNA, batch=batch, mod=modcombat, par.prior=TRUE, prior.plots=F)
Found2batches
Adjusting for0covariate(s) or covariate level(s)
Standardizing Data across genes
Error in ((dat - t(design %*% B.hat))^2) %*% rep(1/n.array, n.array) :
requires numeric/complex matrix/vector arguments
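The two conditions from the log above can be checked before calling ComBat. The following is only a minimal sketch in base R: check_combat_input is a hypothetical helper name (it is not part of sva), and the toy data frame is made up for illustration.

```r
# Hypothetical helper (not part of sva): report common reasons ComBat fails.
check_combat_input <- function(dat) {
  m <- as.matrix(dat)  # a data.frame is converted to a matrix here
  # Row variances are only meaningful for numeric data.
  row_var <- if (is.numeric(m)) {
    apply(m, 1, var, na.rm = TRUE)
  } else {
    rep(NA_real_, nrow(m))
  }
  list(
    was_matrix = is.matrix(dat),                       # FALSE for a data.frame
    is_numeric = is.numeric(m),                        # FALSE if any column is non-numeric
    n_na       = sum(is.na(m)),                        # number of missing values
    n_zero_var = sum(!is.na(row_var) & row_var == 0)   # rows with zero variance
  )
}

# Toy data (assumption): 3 genes x 4 samples; gene "g2" is constant.
toy <- data.frame(s1 = c(1, 5, 2), s2 = c(2, 5, 4),
                  s3 = c(3, 5, 6), s4 = c(4, 5, 8),
                  row.names = c("g1", "g2", "g3"))
res <- check_combat_input(toy)
str(res)
```

If was_matrix is FALSE, wrap the input in as.matrix(); if n_zero_var is greater than zero, filter those rows out first, as done for eRNA in the log.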
I find it strange that you are using ComBat on logged values. Should it not be performed on unlogged data? If you have a batch effect, try to correct for it in the design model of whichever RNA-seq analysis program you are using. ComBat is an extreme form of batch correction.
In any case, it looks like there may be NA values in key parts of your data.
You have a couple of options:
- Remove rows (genes?) with any NA value
- Convert NA values to zero
- Convert NA values to half the minimum
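The three options above could be sketched in base R roughly as follows. The matrix expr and its values are made up for illustration, and "half the min" is assumed here to mean half of the smallest non-missing value in the whole matrix.

```r
# Toy expression matrix (assumption): genes in rows, samples in columns.
expr <- matrix(c(1, NA, 3, 4,
                 2, 2, 5, 6,
                 7, 8, NA, 10),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("g1", "g2", "g3"), paste0("s", 1:4)))

# Option 1: remove rows (genes) with any NA value.
expr_complete <- expr[complete.cases(expr), , drop = FALSE]

# Option 2: convert NA values to zero.
expr_zero <- expr
expr_zero[is.na(expr_zero)] <- 0

# Option 3: convert NA values to half the minimum observed value.
expr_halfmin <- expr
expr_halfmin[is.na(expr_halfmin)] <- min(expr, na.rm = TRUE) / 2
```

Note that options 2 and 3 impute values, which can create constant rows if many entries in a gene are missing, so it is worth re-checking row variance afterwards.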
Hey Kevin, it can be any kind of values.
I found these: https://stackoverflow.com/questions/21532998/error-when-using-combat https://groups.google.com/forum/#!msg/combat-user-forum/_z8DxYQNFJ8/7UI_a2nCoUEJ
It seems the problem is with the variance; I don't have NA values in my data matrix.
Yes, I saw that thread. Rows with zero variance (constant values) will cause problems too. If you are using logged data, you are more likely to have constant rows due to the transformation.
Did you try ComBat on the un-logged counts?
You can check variance with the var() command. For example, to find the rows with non-zero variance, use apply(RNA_seq_log2_D1, 1, var) != 0 to create a TRUE/FALSE vector, which you can then use to filter.
Hey Kevin, thanks for your answer. It seems there's no problem there.
I tried ComBat with the TPM values and I get the same error.
It's kind of strange. If I quantile-normalize my dataset (log2 and counts), the error goes away.
That is strange. It would be great to see the distribution of each dataset with the hist() function! That may give more information.
Histogram (the log2 data)
That's more like an inverse hypergeometric distribution, as opposed to normal/binomial. I wonder if that's part of the issue. There are many counts near 0.
Quantile normalising will produce a more 'normal' distribution, which is perhaps why that works.
Thanks
It seems you're right. Running ComBat on the dataset without quantile normalization and with par.prior = FALSE, which uses non-parametric adjustments instead of assuming a parametric prior distribution, worked!
That's very interesting!
There is another possibility: you probably didn't remove genes that are constant within one batch. Check the values batch by batch, remove the genes that have the same value across all samples within any one batch, and then you can perform the parametric adjustment as well.
Could you give an example of how you could do this?
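One way to do this, as a minimal sketch in base R: compute the row variance separately within each batch and keep only the genes whose variance is non-zero in every batch. keep_variable_in_all_batches is a hypothetical helper name (not part of sva), and the toy matrix and batch labels are made up for illustration.

```r
# Hypothetical helper (not part of sva): TRUE for genes whose variance is
# non-zero within every batch, FALSE otherwise.
keep_variable_in_all_batches <- function(dat, batch) {
  dat <- as.matrix(dat)
  batch <- as.factor(batch)
  stopifnot(ncol(dat) == length(batch))
  keep <- rep(TRUE, nrow(dat))
  for (b in levels(batch)) {
    cols <- which(batch == b)
    if (length(cols) < 2) next  # variance is undefined for fewer than 2 samples
    v <- apply(dat[, cols, drop = FALSE], 1, var)
    keep <- keep & !is.na(v) & v > 0
  }
  keep
}

# Toy data (assumption): 3 genes x 4 samples, two batches of two samples.
# Gene "g2" varies overall but is constant within batch "A".
toy <- matrix(c(1, 2, 3, 4,
                5, 5, 1, 9,
                2, 4, 6, 8),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("g1", "g2", "g3"), paste0("s", 1:4)))
toy_batch <- c("A", "A", "B", "B")

keep <- keep_variable_in_all_batches(toy, toy_batch)
filtered <- toy[keep, , drop = FALSE]

# The filtered matrix can then be passed to ComBat, for example:
# combat_edata <- ComBat(dat = filtered, batch = toy_batch, mod = modcombat, par.prior = TRUE)
```

In the toy data, g2 passes the overall apply(dat, 1, var) > 0 filter but is still dropped here, which is the case the overall variance check misses.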