Hello all,
I'm using R to investigate gene co-expression modules in a dataset, and I would like to add some 'noise' to the matrix and then re-compute the co-expression modules from this noisy matrix. This would give me an idea of the robustness of the modules.
(In this case we have rows = genes, columns = samples)
The best idea so far is to re-order the rows within a subset of samples (and vary the number of samples re-ordered to simulate increasing amounts of noise).
Unfortunately I cannot find a way to reorder a subset of columns in a quick manner, e.g.
# copy gx matrix to scramble
gx <- gx.filt.test
# generate sample of cols to scramble:
scr.cols <- sample(x=ncol(gx), size=scr.number)
# shuffle the values within each selected column, one column at a time
for (j in seq_along(scr.cols)) {
  gx[, scr.cols[j]] <- sample(gx[, scr.cols[j]])
}
This is unsurprisingly very slow due to the for() loop.
Using apply() is certainly faster, but is still quite slow:
# drop = FALSE keeps a matrix even when only one column is selected
gx[, scr.cols] <- apply(X = gx[, scr.cols, drop = FALSE], MARGIN = 2, FUN = sample)
Is there an obvious way to do this that I'm missing, or a package that has a function to do so quickly?
Thanks very much in advance!
Is this something that vegan::permat could be used for?
Ah, yes, the strata option looks like it could possibly do the job, although from my reading of the manpage it looks like the permutations still occur within each specified stratum, whereas I want to permute the data within only one of those strata.
Thank you for the suggestion, I will try it out next time. It turns out that the co-expression calculations take far longer than the sampling, so making this step slightly faster has no effect on the total runtime anyway!
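For anyone who does need the scrambling step itself to be quicker, one loop-free possibility in base R is to sort the linear indices of the selected block by column and then by a random key, which shuffles the values within each column in a single call to order(). This is only a sketch: scramble_cols, gx.demo and scr.demo are made-up names and toy data for illustration, and whether it actually beats apply() will depend on the matrix size, so it is worth timing on real data.

```r
# Hypothetical helper (not from any package): permute the values within each
# of the chosen columns of a genes x samples matrix, leaving other columns alone.
scramble_cols <- function(gx, scr.cols) {
  sub <- gx[, scr.cols, drop = FALSE]
  # Sorting by (column index, random key) keeps each column's indices together
  # but puts them in random order within that column.
  idx <- order(col(sub), runif(length(sub)))
  # sub[idx] is in column-major order, so it fills the block column by column.
  gx[, scr.cols] <- sub[idx]
  gx
}

# Toy data standing in for the real expression matrix: 1000 genes x 20 samples
set.seed(42)
gx.demo  <- matrix(rnorm(1000 * 20), nrow = 1000)
scr.demo <- sample(ncol(gx.demo), size = 5)
gx.noisy <- scramble_cols(gx.demo, scr.demo)
```

Calling scramble_cols() with an increasing number of columns would then give the increasing noise levels described in the question.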