How To Define Batch And Covariate To Remove Batch Effects With Combat.R
11.5 years ago
fbrundu ▴ 350

Hi all, newbie here.

I am trying to remove batch effects from an Affymetrix microarray data set (this one) using ComBat.R.

I have loaded the CEL files into dChip and run dChip's "Normalize & Model" and "Export expression value". This gave me an expression value file, "dChip_signal.xls", which contains both signal and call values and is the first input parameter to ComBat.R.

The second parameter is a sample information file, a tab-separated file laid out like this:

Array name   Sample name   Batch   Covariate1(treatment) ...
Array1       Sample1       1       Tissue1               ...
...
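
As a rough illustration of that layout, a file like this could be generated with a few lines of Python. This is only a sketch: the array names, sample names, batch numbers and tissue labels are made-up placeholders, the header text simply follows the layout above, and the exact headers ComBat.R expects should be checked against its own documentation.

```python
import csv
import io

# Hypothetical arrays: (array name, sample name, batch, covariate).
# None of these values come from the real data set.
rows = [
    ("Array1", "Sample1", 1, "Tissue1"),
    ("Array2", "Sample2", 1, "Tissue2"),
    ("Array3", "Sample3", 2, "Tissue1"),
    ("Array4", "Sample4", 2, "Tissue2"),
]


def sample_info_tsv(rows):
    """Return the sample information table as tab-separated text."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    # Header follows the column layout described in the post.
    writer.writerow(["Array name", "Sample name", "Batch", "Covariate1"])
    writer.writerows(rows)
    return buf.getvalue()


tsv_text = sample_info_tsv(rows)
print(tsv_text)
```

The text in `tsv_text` can then be saved to whatever file name is passed to ComBat.R as the sample information file.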

The reference for these parameters is here

I have read here that "Batch effects are technical sources of variation that have been added to the samples during handling". I do not understand how to assess and determine the batch and covariates for each array in the data set (each of which, as far as I know, is identified by a CEL file). Am I missing anything?

I will provide more information if needed.

Thanks

microarray affymetrix • 9.2k views
11.5 years ago
Neilfws 49k

You need to think about how the arrays were processed. Were they scanned on different days (CEL files should include scan date information)? In different labs by different people? If so, there is a potential for batch effects.
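
If scan date turns out to be the relevant factor, one simple way to turn it into a batch column is to number the distinct dates in chronological order. The sketch below assumes the scan dates have already been extracted from the CEL headers by some other tool; the file names and dates are invented for illustration, and treating each scan day as its own batch is itself an assumption that may not suit every experiment.

```python
from datetime import date

# Hypothetical scan dates; real ones would come from the CEL file headers.
scan_dates = {
    "Array1.CEL": date(2005, 3, 1),
    "Array2.CEL": date(2005, 3, 1),
    "Array3.CEL": date(2005, 3, 8),
    "Array4.CEL": date(2005, 3, 8),
    "Array5.CEL": date(2005, 4, 2),
    "Array6.CEL": date(2005, 4, 2),
}


def batches_from_dates(scan_dates):
    """Number the distinct scan dates 1, 2, ... in chronological order."""
    distinct = sorted(set(scan_dates.values()))
    batch_of_date = {d: i for i, d in enumerate(distinct, start=1)}
    return {array: batch_of_date[d] for array, d in scan_dates.items()}


batches = batches_from_dates(scan_dates)
for array, batch in sorted(batches.items()):
    print(array, batch)
```

The resulting numbers would go in the Batch column of the sample information file, one row per array.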


Thanks for the suggestion. However, I think this data almost certainly has batch effects, since some of the papers I have seen that use it say their processing pipeline includes batch-effect removal. Anyway, I will try to work out how to divide the data into batches. One option would be to put each array series in a different batch, or maybe to give each CEL file its own batch, but I am still unsure whether that would work.


You should not divide them yourself, arbitrarily. There should be a batch factor that you already know, which comes from running the samples on different days (one dimension), on different machines (another dimension), etc. Does that help?
