Batch Effect normalization with Combat generates error
1
0
Entering edit mode
9.5 years ago
madkitty ▴ 690

I have 6 batches (10 samples in each) from 6 arrays that we want to normalize for batch effect with Combat. When I run the script below, I encounter an error, any idea how to fix this?

I'm editing the current code with all modification and current error message that I get.

library(sva)

dat = read.csv("Combat_matrix_input.csv");
sif = read.csv("sif.csv");
batch = sif$Batch

head(batch)
[1] Batch1 Batch1 Batch1 Batch1 Batch1 Batch1
Levels: Batch1 Batch2 Batch3 Batch4 Batch5 Batch6

modcombat = model.matrix(~1, data=dat)
dat <- as.matrix(dat)
combat_edata = ComBat(dat=dat, batch=batch, mod=modcombat, par.prior=TRUE, prior.plots=FALSE)
Found 6 batches
Error in cbind(batchmod, mod) :
  number of rows of matrices must match (see arg 2)

# Here I'm trying with batch as.vector
batch = as.vector(sif$Batch)
combat_edata = ComBat(dat=dat, batch=batch, mod=modcombat, par.prior=TRUE, prior.plots=FALSE)
Found 6 batches
Error in cbind(batchmod, mod) :
  number of rows of matrices must match (see arg 2)

Here I display the output of column Batch in sif

# I tried both batch as.vector as you guys recommended
batch = as.vector(sif$Batch)
batch = sif$Batch

# Both batch as.vector or regular variable return the error in combat before, and return the same head()
head(sif$Batch)
[1] Batch1 Batch1 Batch1 Batch1 Batch1 Batch1
Levels: Batch1 Batch2 Batch3 Batch4 Batch5 Batch6
microarray normalization combat • 10k views
ADD COMMENT
0
Entering edit mode

Can you show us the output of

head(sif$Batch)
ADD REPLY
0
Entering edit mode

@andrew.j.skelton73: I updated the original post with the output of head(sif$Batch). Thanks :)

ADD REPLY
0
Entering edit mode

Will as.vector(sif$Batch) help?

ADD REPLY
0
Entering edit mode

Just found my script. This was what I did with ComBat

I was trying to merge RNA Seq data with microarray. mydata is my microarray data and vsd is my RNA-Seq data whereas data.input is the merge data frame

pheno = data.frame(row.names=row.names(pData(mydata)), timepoint= c(rep("GD9.5",5), rep("GD11.5",4),rep("GD13.5",6), "GD9.5"), platform="1", condition="Control")
phenoRNA = data.frame(row.names=colnames(assay(vsd)), condition=c(rep("Case", 5), rep("Control", 5)), platform="2", timepoint="GD9.5")
phenotype = rbind(pheno, phenoRNA)
mod = model.matrix(~as.factor(condition), data=phenotype )
mod0 = model.matrix(~1,data=phenotype )
combat_edata = ComBat(dat=data.input, batch=batch, mod=mod, numCovs=NULL, par.prior=TRUE, prior.plots=FALSE)
pValuesComBat = f.pvalue(combat_edata,mod,mod0)
ADD REPLY
0
Entering edit mode

Thanks Sam, though in our case we only have genes in rows and samples in columns. The sample info file contains only array, sample_name and batch. Without any condition, how should I write mod? I don't understand what should go under ~as.factor()

ADD REPLY
0
Entering edit mode

@Sam: I tried that

batch = as.vector(sif$Batch)

Unfortunately the result is the same..

ADD REPLY
1
Entering edit mode

According to page 6 of the user manual,

batch = pheno$batch
modcombat = model.matrix(~1, data=pheno) 
combat_edata = ComBat(dat=edata, batch=batch, mod=modcombat, par.prior=TRUE, prior.plots=FALSE)

The deviation of this code and yours is that instead of data=pheno, you use data=dat. If you look at my example script, you should note that the phenois the phenotype matrix. What I suspect is the data in the mod should contain the phenotype and the batch instead of the count table. try and see if

modcombat = model.matrix(~1, data=sif)

solves the problem? That's just my guess though

ADD REPLY
0
Entering edit mode

Hi Sam, Thanks again for your answer. When replacing for data=sif in modcombat, I get another type of error (see below). The manual isn't useful at all .. I'm really desperate to find a solution :(

Found 6 batches
Adjusting for 0 covariate(s) or covariate level(s)
Standardizing Data across genes
Error in solve(t(design) %*% design) %*% t(design) %*% t(as.matrix(dat)) : 
  requires numeric/complex matrix/vector arguments
ADD REPLY
0
Entering edit mode

Can you show us how your sif looks like?

ADD REPLY
0
Entering edit mode

Actually, I gave it another try, and indeed your suggestions worked well. Just for those wondering, the manual isn't clear, MOD should contain SAMPLE INFO FILE and COVARIATE information as a data.frame. Thanks Sam!

ADD REPLY
0
Entering edit mode

Good to know that it works

ADD REPLY
0
Entering edit mode
9.5 years ago

I'm pretty sure that your variable should be numeric to indicate batches. So change sif$Batch to numeric

ADD COMMENT
0
Entering edit mode

Thanks, I've tried to define batch = as.numeric(sif$batch) It really didn't work ..

ADD REPLY
1
Entering edit mode

ok, change the 'mod' parameter in your combat call to 'NULL' - So,

combat_edata = ComBat(dat=dat, batch=batch, mod=modcombat, par.prior=TRUE, prior.plots=FALSE)

is changed to

combat_edata = ComBat(dat=dat, batch=batch, mod=NULL, par.prior=TRUE, prior.plots=FALSE)
ADD REPLY

Login before adding your answer.

Traffic: 2546 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6