Hi guys,
I would like to know whether it is possible to compute only half of a correlation matrix. I mean, I don't want to compute the whole matrix and then extract just the upper or lower half; I want to get the half matrix directly.
I use R. I already have some code, but it gives the whole matrix:
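library(doParallel)   # also attaches foreach
registerDoParallel()  # assumed setup, not shown in the original post
data <- read.table("/home/vipailler/PROJET_M2/raw/truelength2.prok2.uniref2.rares.tsv", sep = "\t", header = TRUE, row.names = 1) + 1
# One row of the result per column i: the statistic between column i and every column x
res <- foreach(i = seq_len(ncol(data)),
               .combine = rbind,
               .multicombine = TRUE,
               .inorder = FALSE,
               .packages = c('data.table', 'doParallel')) %dopar% {
  apply(data, 2, function(x) 1 - ((var(data[,i] - x)) / (var(data[,i]) + var(x))))
}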
Thanks
For the record, this code is mine, from my previous answer:
I read some documentation about the functions upper.tri and lower.tri, but I don't know whether they would be suitable in this code?
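For reference, here is a small sketch of what upper.tri and lower.tri do. They only build a logical mask over a matrix that already exists, so on their own they select or blank out one half rather than skip any computation. The toy matrix `m` is just an illustration, not the OTU data.

```r
# Toy 3 x 3 matrix, filled column by column.
m <- matrix(1:9, nrow = 3)

# Logical mask: TRUE strictly above the diagonal.
upper.tri(m)

# Values above the diagonal, returned in column-major order.
upper_vals <- m[upper.tri(m)]

# Blank out the lower half of a copy; the cells still exist, they just hold NA.
half <- m
half[lower.tri(half)] <- NA
half
```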
Yes, I know what you are aiming to do. A data-frame/matrix is 'rectangular' in structure, though. If you just fill the upper part, you still have to have the bottom part, even if the cells are empty.
My goal here is to save time. So, if I fill the upper or the lower part with 0 and compute the correlations only for the other part, will I save time?
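One way to look at it: the statistic in the code above is symmetric, since var(x - y) equals var(y - x), so each pair of columns only needs to be visited once. Below is a minimal sequential sketch of that idea, assuming a small made-up table in place of the real file; the names `rho`, `res` and `full` are only for illustration. It roughly halves the number of pairwise evaluations, although whether the wall-clock time halves as well depends on the parallel setup.

```r
# Hypothetical toy data standing in for the OTU table
# (rows = samples, columns = OTUs); not the real file.
set.seed(1)
data <- matrix(rpois(60, lambda = 5) + 1, nrow = 10,
               dimnames = list(NULL, paste0("OTU", 1:6)))

# The statistic used in the question; it is symmetric in x and y.
rho <- function(x, y) 1 - var(x - y) / (var(x) + var(y))

n <- ncol(data)
res <- matrix(NA_real_, n, n,
              dimnames = list(colnames(data), colnames(data)))

# Visit each unordered pair once: j only runs over the columns after i.
for (i in seq_len(n - 1)) {
  for (j in (i + 1):n) {
    res[i, j] <- rho(data[, i], data[, j])
  }
}
diag(res) <- 1  # var(x - x) is 0, so the statistic is 1 on the diagonal

# Optional: mirror the upper triangle if a full matrix is needed later.
full <- res
full[lower.tri(full)] <- t(full)[lower.tri(full)]
```

The lower half of `res` stays NA, which matches the point made above: the matrix is still rectangular, only the work of filling it is halved.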
You may just have to be patient. You could add a line such as this to your foreach loop:
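A minimal sketch of such a progress line follows. The helper name `report_progress` is an assumption for illustration and not the original line. A plain for loop stands in for the foreach body here so that the snippet runs without a cluster.

```r
# Hypothetical helper: emit a message on every `every`-th index.
report_progress <- function(i, every = 100) {
  hit <- i %% every == 0
  if (hit) message("processed column ", i)
  invisible(hit)
}

# Stand-in for the foreach loop: in the real code the call would be the
# first line inside `%dopar% { ... }`.
for (i in 1:300) {
  report_progress(i)
}
```

With a socket cluster, output from the workers is usually not shown on the main console unless the cluster was created with something like makeCluster(n, outfile = ""), so that may be needed as well.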
Then it will print the value of i after every 100 values are processed. You may see, for example, 300 coming before 200, because with parallel processing one core can finish before another.
Correlations for what?
They are correlations between OTUs.
You need to provide more information. Which language (R, Python, etc.)?
The answer you get is at most as good as the question you ask:
Sorry for the lack of information. I answered above.
Could you show us how your data is formatted? Thanks.