Hi all, newbie here.
I am trying to remove batch effects from an Affymetrix microarray data set (this one) using ComBat.R.
I loaded the CEL files into dChip and ran dChip's "Normalize & Model" and "Export expression value" steps. This gave me an expression value file, "dChip_signal.xls", which contains both signal and call values and is the first input to ComBat.R.
The second input is a sample information file, a tab-separated file laid out like this:
Array name Sample name Batch Covariate1(treatment) ...
Array1 Sample1 1 Tissue1 ...
...
The reference for these parameters is here.
I have read that "Batch effects are technical sources of variation that have been added to the samples during handling" (here). I do not understand how to determine the batch and covariates for each array in the data set (as far as I know, each array corresponds to one CEL file). Am I missing anything?
I will provide more information if needed.
Thanks
Thanks for the suggestion. However, I think this data almost certainly has batch effects, since some studies I have seen that use it say their processing pipeline includes batch-effect removal. Anyway, I will try to figure out how to divide the data into batches. One option could be to treat each array series as a separate batch, or maybe even each CEL file as its own batch, but I am still unsure whether that would work.
You should not divide them yourself, arbitrarily. There should be a batch factor that you already know, which comes from running samples on different days (one dimension), on different machines (another dimension), etc. Does that help?
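To make that concrete, here is a minimal Python sketch of deriving the Batch column from a known processing factor, in this case the run date. Everything in it is an assumption for illustration: the array names, sample names, dates and treatments are made up, and the helper names assign_batches and to_tsv are hypothetical. The column layout just follows the sample information file shown in the question, so check the exact header ComBat.R expects against its reference before using it.

```python
import csv
import io

# Hypothetical records: (array name, sample name, run date, treatment).
# The run dates are assumed to come from lab records; these values are made up.
arrays = [
    ("Array1", "Sample1", "2010-03-01", "Tissue1"),
    ("Array2", "Sample2", "2010-03-01", "Tissue2"),
    ("Array3", "Sample3", "2010-03-08", "Tissue1"),
    ("Array4", "Sample4", "2010-03-08", "Tissue2"),
]


def assign_batches(records):
    """Give each distinct run date one batch number, in chronological order."""
    # ISO-formatted dates (YYYY-MM-DD) sort chronologically as plain strings.
    dates = sorted({record[2] for record in records})
    batch_of = {date: number for number, date in enumerate(dates, start=1)}
    return [(name, sample, batch_of[date], treatment)
            for name, sample, date, treatment in records]


def to_tsv(rows):
    """Render the rows as tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    # Header modelled on the layout in the question (an assumption).
    writer.writerow(["Array name", "Sample name", "Batch", "Covariate1"])
    writer.writerows(rows)
    return buffer.getvalue()


rows = assign_batches(arrays)
print(to_tsv(rows))
```

The point is that the batch number is looked up from something recorded about how the arrays were processed, not invented per CEL file. If the arrays were also run on different machines, that would be a second known factor to record in the same way.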