Is ComBat unable to handle only two batches? What are good alternatives?
4 months ago
samuelandjw ▴ 260

Final Edit: The lesson here is: don't convert variables with only two levels to factors. Just leave them as 0/1 numerics.

Edit 2: Sorry, I wrongly inferred that ComBat cannot handle 2 batches.

Original: I have an Olink NPX dataset with only 2 batches that needs batch correction. I tried ComBat, but it stops with the error message "At least one covariate is confounded with batch! Please remove confounded covariates and rerun ComBat", with or without the design matrix. I looked into the code, and it seems that ComBat cannot handle batch correction with only 2 batches.

Am I correct? What are good alternatives for batch correction of a dataset with only 2 batches?

Edit 1:

While I cannot post the covariates and batch variables directly, I can provide the correlation matrix of design (the matrix ComBat uses to infer confounding):

structure(c(1, -1, 0, 0, -0.0260011520030977, 0.0791694781327139,
-0.0460764787433406, 0.108519031075261, 0.0123805848206014, -1,
1, 0, 0, 0.0260011520030977, -0.0791694781327139, 0.0460764787433406,
-0.108519031075261, -0.0123805848206014, 0, 0, 1, -1, -0.260035694607971,
-0.608536557719787, -0.105492246706702, -0.412662510120705, 0.0596664424282021,
0, 0, -1, 1, 0.260035694607971, 0.608536557719787, 0.105492246706702,
0.412662510120705, -0.0596664424282021, -0.0260011520030977,
0.0260011520030977, -0.260035694607971, 0.260035694607971, 1,
0.264457402769079, 0.10259294363056, 0.249699029250425, 0.0500069775024256,
0.0791694781327139, -0.0791694781327139, -0.608536557719787,
0.608536557719787, 0.264457402769079, 1, 0.0162943643294796,
0.678122792929598, -0.0214816402734835, -0.0460764787433406,
0.0460764787433406, -0.105492246706702, 0.105492246706702, 0.10259294363056,
0.0162943643294796, 1, -0.000200465769279558, 0.0884104795401378,
0.108519031075261, -0.108519031075261, -0.412662510120705, 0.412662510120705,
0.249699029250425, 0.678122792929598, -0.000200465769279558,
1, -0.0710095314512266, 0.0123805848206014, -0.0123805848206014,
0.0596664424282021, -0.0596664424282021, 0.0500069775024256,
-0.0214816402734835, 0.0884104795401378, -0.0710095314512266,
1), dim = c(9L, 9L), dimnames = list(c("batch0", "batch1", "as.factor(gender)0",
"as.factor(gender)1", "age", "as.factor(smoking_status)1",
"BMI", "pack_year", "second_hand_smoking"), c("batch0",
"batch1", "as.factor(gender)0", "as.factor(gender)1", "age",
"as.factor(smoking_status)1", "BMI", "pack_year", "second_hand_smoking"
)))
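The dimnames above contain both as.factor(gender)0 and as.factor(gender)1, correlated at exactly -1, just like batch0 and batch1. Here is a minimal sketch of how that can produce the confounding error. It uses made-up data and assumes the covariate matrix was built without an intercept (the original call is not shown, so this is a guess). The variable names mirror the post, but every value is simulated.

```r
# Simulated data; names mirror the post, values are made up.
set.seed(1)
n      <- 40
batch  <- factor(rep(c(0, 1), each = n / 2))
gender <- rep(c(0, 1), times = n / 2)
age    <- rnorm(n, mean = 50, sd = 10)

# Batch indicators: one column per batch, no intercept.
batchmod <- model.matrix(~ -1 + batch)

# Assumption: covariates built without an intercept, so the two-level
# factor gets one indicator column per level.
mod_factor  <- model.matrix(~ -1 + as.factor(gender) + age)
# Same covariates with gender left as a 0/1 numeric: a single column.
mod_numeric <- model.matrix(~ -1 + gender + age)

design_factor  <- cbind(batchmod, mod_factor)
design_numeric <- cbind(batchmod, mod_numeric)

# gender0 + gender1 equals batch0 + batch1 (both sums are all ones), so
# the factor-coded design loses one rank; the numeric-coded one does not.
rank_factor  <- qr(design_factor)$rank
rank_numeric <- qr(design_numeric)$rank
cat("factor coding:  rank", rank_factor, "of", ncol(design_factor), "columns\n")
cat("numeric coding: rank", rank_numeric, "of", ncol(design_numeric), "columns\n")
```

Keeping gender as a factor but using a formula with an intercept (~ as.factor(gender) + age) also yields a single gender column, so the factor type itself is not the problem in this sketch. The no-intercept coding is.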
ComBat batch correction • 392 views

Can you show a table summarizing the conditions of your experiment and the batches, i.e. which samples belong to which batch? The error tells you that either the batches are nested within each other or batch is linearly dependent on a condition.
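A cross-tabulation is usually enough for this. The sketch below uses a hypothetical sample sheet (the column names sample, batch and condition are placeholders, not the poster's data) and base R's table():

```r
# Hypothetical sample sheet; replace with the real metadata.
meta <- data.frame(
  sample    = paste0("S", 1:8),
  batch     = c(0, 0, 0, 0, 1, 1, 1, 1),
  condition = c("case", "control", "case", "control",
                "case", "case", "control", "control")
)

# Rows are batches, columns are conditions. A zero cell means that
# condition is absent from that batch.
tab <- table(batch = meta$batch, condition = meta$condition)
print(tab)

# TRUE if every condition appears in every batch.
all_cells_filled <- all(tab > 0)
```

If a whole row or column pattern is zero except for one cell, batch and condition are aligned and cannot be separated.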


Thank you! Please see my edit.

4 months ago

ComBat can handle 2 batches just fine as far as I know. More likely, your batches are totally aligned with your covariate of interest, thereby making it impossible to account for properly. Posting your sample metadata and code would help us determine if that is indeed the case.
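To make "totally aligned" concrete, here is a small sketch with made-up data comparing a fully aligned layout with a balanced one. A rank check on the combined design is one way to test for this. Whether it matches ComBat's internal check exactly is an assumption, and is_full_rank is a helper defined here for illustration.

```r
# Made-up example: 8 samples, 2 batches, 2 groups.
batch    <- factor(c(0, 0, 0, 0, 1, 1, 1, 1))
batchmod <- model.matrix(~ -1 + batch)

# Case 1: group fully aligned with batch (one group per batch).
group_aligned  <- c(0, 0, 0, 0, 1, 1, 1, 1)
# Case 2: both groups present in both batches.
group_balanced <- c(0, 1, 0, 1, 0, 1, 0, 1)

# Illustrative helper: a design is usable only if no column is a linear
# combination of the others.
is_full_rank <- function(design) qr(design)$rank == ncol(design)

aligned_ok  <- is_full_rank(cbind(batchmod, group = group_aligned))
balanced_ok <- is_full_rank(cbind(batchmod, group = group_balanced))
cat("aligned design full rank: ", aligned_ok, "\n")
cat("balanced design full rank:", balanced_ok, "\n")
```

In the aligned case the group column is identical to the batch1 indicator, so no model can tell the batch effect from the group effect.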

As for solutions, if batch and your covariate of interest are fully confounded, there's nothing you can do. Put more thought into experimental design next time to ensure samples from all groups of interest are spread across batches.
