Well this is old but still we were able to solve this issue and I'd like to share it so no one else will suffer.
I assume the two chips you use share some probesets and you only want to deal with those shared ones as you can't really compare non existing probes.
You need to tinker with the code of rma function. You can see the code by ctrl + click on the rma function written in your script but I'll try to clear enough here so you won't need to do that.
Also edit: do not load Gdata package beforehand
The critical parts are 1) :
exprs <- .Call("rma_c_complete_copy", pm(object, subset),
pNList, ngenes, normalize, background, bgversion,
verbose, PACKAGE = "affy")
that outputs the resulting expressions and 2)
new("ExpressionSet", phenoData = phenoData(object), annotation = annotation(object),
protocolData = protocolData(object), experimentData = experimentData(object),
exprs = exprs
that creates the resulting object of the function.
In part 1 inputs normalize
, background
, bgversion
and verbose
normally comes from the inputs of the original rma function. Fill them as you would normally. To set them to default just use
verbose = TRUE
destructive = TRUE
normalize = TRUE
bgversion = 2
There are 3 things you need to create manualy: pNList, nGenes and pm(object, subset). Read both groups seperately. I just placed them in different directories
setwd('HGU133a')
affyA <- ReadAffy()
setwd('..')
setwd('HGU133b')
affyB <- ReadAffy()
setwd('..')
You want to get the common probes so do
pNListA = probeNames(affyA)
pNListB = probeNames(affyA)
subsetList = pNListA[pNListA %in% pNListB]
Now you have the subsets so you can request pms of both samples and stitch them together
subsetPmA = pm(affyA, unique(subsetList))
subsetPmB = pm(affyB, unique(subsetList))
allPm = cbind(subsetPm, subsetPmOldOrdered) #this will go into the .call function
The two variables left for part 1 is simple just do
ngenes = length(unique(subsetList))
pNList = split(0:(length(subsetList) - 1), subsetList)
and run part 1 to get normalized expression values
exprs <- .Call("rma_c_complete", allPm,
pNList, ngenes, normalize, background, bgversion,
verbose, PACKAGE = "affy")
To create the new object you need to stitch the components of the two objects together. For annotation, if one of your chips has a subset of the probes in the other probe, just use that one, but I don't think it matters that much, you have what you need at this point. Just don't mix up the order when you are using combine
.
phenoD = combine(phenoData(affyA), phenoData(affyB))
annot = annotation(affyA)
protocolD = combine(protocolData(affyA), protocolData(affyB))
experimentD = experimentData(affyA)
newNormalized = new("ExpressionSet", phenoData = phenoD, annotation = annot,
protocolData = protocolD, experimentData = experimentD,
exprs = exprs)
That's it. you now have your handcrafted rma output. Use it as you normally would.
many thanks for your reply sean, istvan and fanofactor.
thank you all again.
best,
mohan
With regard to #3, combineAffyBatch is for combining arrays that share content. Hgu133a and hgu133b do not share content, so this will be equivalent to combining rows.
With regard to #2, note that correlation is scale-free. Try this experiment:
That being the case, scaling the rows will not change a "co-expression" analysis that relies on correlation (and most do).