10.8 years ago
faustinomarin10
Hello,
The GEO2R R scripts for analysing microarray data (https://www.ncbi.nlm.nih.gov/geo/info/geo2r.html) include an auto-detect step that decides whether the values are already in log space:
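ex <- exprs(gset)
# Quantiles: qx[1]=min, qx[2]=25%, qx[3]=median, qx[4]=75%, qx[5]=99%, qx[6]=max
qx <- as.numeric(quantile(ex, c(0, 0.25, 0.5, 0.75, 0.99, 1), na.rm = TRUE))
# LogC is TRUE when the values look untransformed, i.e. log2 still has to be applied
LogC <- (qx[5] > 100) ||
  (qx[6] - qx[1] > 50 && qx[2] > 0) ||
  (qx[2] > 0 && qx[2] < 1 && qx[4] > 1 && qx[4] < 2)
if (LogC) {
  ex[which(ex <= 0)] <- NaN  # non-positive values have no finite log2
  exprs(gset) <- log2(ex)
}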
My question is: how can these specific quantile conditions (e.g. qx[5] > 100) determine whether or not the values are in log space?
Thanks a lot
I agree with your reasoning. But why specifically 100? I wonder whether it was chosen arbitrarily.
Also, I think the values used in the other quantile conditions would need separate explanations.
It's a nice round number; 99 or 101 would work just as well. That's also why 50 is used when testing the range.
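To illustrate why the exact cutoff hardly matters, here is a minimal sketch in R. It is not part of GEO2R: `needs_log2` is a hypothetical helper that re-implements only the first condition (99th percentile above a cutoff), and the intensities are simulated under the assumption that raw values are roughly log-normal and run into the thousands.

```r
# Hypothetical helper (not from GEO2R): the first condition only,
# with the cutoff exposed as a parameter.
needs_log2 <- function(ex, cutoff = 100) {
  q99 <- as.numeric(quantile(ex, 0.99, na.rm = TRUE))
  q99 > cutoff
}

set.seed(1)
# Simulated raw intensities (assumption: log-normal, centred on 2^8)
raw <- 2^rnorm(10000, mean = 8, sd = 2)
# The same data after a log2 transform
logged <- log2(raw)

cutoffs <- c(99, 100, 101)
sapply(cutoffs, function(k) needs_log2(raw, k))     # all TRUE
sapply(cutoffs, function(k) needs_log2(logged, k))  # all FALSE
```

In this simulation the 99th percentile of the raw data is in the thousands, while for the log2 data it stays in the low teens, so any cutoff in that wide gap gives the same answer.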