Hi,
I have an expression matrix, gse14814.gcrma, which I have normalised and filtered.
current_study <- gse14814.gcrma[gse14814_probes,]
These are the genes that I wish to use, along with their expression values.
I have created an empty matrix, di_matrix, with the dimensions of the above expression matrix (90 rows by 1567 genes/columns), which I fill in with dichotomized values based on the expression. For each gene I want to calculate the median expression for that column/gene. Then, for each array and each gene, if the expression is above the median I assign a 1 to the same position in di_matrix; if it is lower, I assign a 0 to the same position in di_matrix.
So I think I should create a for loop:
rownames(di_matrix) <- sampleNames(current_study)
colnames(di_matrix) <- featureNames(current_study)
for (i in 1:1567) {
medianVal <- median(exprs(current_study[,i]))
current_logical <- exprs(current_study[,i]) > medianVal
current_di_gene <- as.numeric(current_logical)
di_matrix[,i] <- current_di_gene
}
This is wrong; it gives me back:
Error in gse14814dimatrix[, i] <- currentdigene : number of items to replace is not a multiple of replacement length
I'm sorry, I don't have a lot of experience in R; I'm very much a beginner.
Thanks for the help, R
Try to use apply functions instead of loops. http://www.ats.ucla.edu/stat/r/library/advanced_function_r.htm
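A minimal sketch of that idea, assuming current_study is a Biobase ExpressionSet, so that exprs() returns genes in rows and samples in columns. If that assumption holds, current_study[,i] in the loop selects sample i rather than gene i, which would explain the length mismatch in the error. The toy matrix and its names below are made up for illustration and stand in for exprs(current_study):

```r
# Toy stand-in for exprs(current_study): genes in rows, samples in columns.
# The sizes and names here are hypothetical, chosen only for illustration.
set.seed(1)
expr <- matrix(rnorm(5 * 6), nrow = 5, ncol = 6,
               dimnames = list(paste0("gene", 1:5), paste0("sample", 1:6)))

# Median expression of each gene across all samples (one value per row).
gene_medians <- apply(expr, 1, median)

# Compare every value with the median of its own gene. R stores a matrix
# column by column, so the vector of per-gene medians is recycled down each
# column and lines up gene by gene. Multiplying by 1 turns TRUE/FALSE into 1/0.
di_genes_by_samples <- (expr > gene_medians) * 1

# Transpose so that samples are rows and genes are columns, as in di_matrix.
di_matrix <- t(di_genes_by_samples)
```

With the real data you would presumably replace the toy expr with exprs(current_study); the row and column names then carry over, so the separate rownames() and colnames() assignments should not be needed.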
Just a comment here since I think the answers are going to get you where you need to go. If you find yourself using a "for" loop over rows or columns, you should look for an "apply" that fits your needs instead. Using an "apply" can sometimes be orders-of-magnitude faster for the same result.
I'll test this in the morning. Thanks for the advice, I really appreciate it, guys.