filtering genes by pearson correlation
6.7 years ago
mannoulag1 ▴ 120

Hi biostars,

I computed Pearson correlations on my data (an expression matrix) and want to keep only the correlations > 0.8. How can I obtain the sub-matrix of expression values for only these highly correlated genes? Thank you.

data<-t(matrix)
cor = cor(data, use="pairwise.complete.obs", method="pearson")
cor<-cor[abs(cor)>0.8]
correlation RNA-Seq R • 2.2k views
6.7 years ago

Extract the indices of the genes of interest with which() and the arr.ind option, e.g.

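idx <- which(abs(cor) > 0.8, arr.ind = TRUE)
# genes are the columns of data (data <- t(matrix)), so select columns;
# idx still includes the diagonal, and each gene can appear several times
correlated.genes <- data[, idx[, 1]]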
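
To see what arr.ind = TRUE changes, here is a minimal sketch on a made-up 3 x 3 correlation matrix. The values and the names m, g1, g2, g3, vals and pos are illustrative assumptions, not taken from the question's data. Plain logical subsetting returns a bare vector of values, whereas which() with arr.ind = TRUE returns the row and column of each hit.

```r
# Hypothetical symmetric correlation matrix for three genes.
m <- matrix(c(1.0, 0.9, 0.1,
              0.9, 1.0, 0.2,
              0.1, 0.2, 1.0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("g1", "g2", "g3"), c("g1", "g2", "g3")))

# Logical subsetting flattens the matrix: the gene labels are lost.
vals <- m[abs(m) > 0.8]

# arr.ind = TRUE gives one (row, col) pair per hit instead.
pos <- which(abs(m) > 0.8, arr.ind = TRUE)

print(vals)
print(pos)
```

The diagonal is among the hits because every gene correlates perfectly with itself, and each off-diagonal pair shows up twice because the matrix is symmetric.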

Thank you, Jean-Karim. I did this:

#cor is symmetric, so we can keep only the half of the pairs of indices
idx<-which( (abs(cor) > 0.8) & (upper.tri(cor)), arr.ind=TRUE)
correlated.genes <- matrix[idx, ]

Do I then have to remove the duplicated genes from 'correlated.genes'?


This was just to give you a quick pointer. What I think you want is the unique indices, something like:

idx <- which( (abs(cor) > 0.8) & (upper.tri(cor)), arr.ind=TRUE)
idx <- unique(c(idx[, 1], idx[, 2]))
correlated.genes <- matrix[idx, ]
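
As a self-contained illustration of the steps above, here is a minimal sketch on a toy matrix. It assumes genes are in rows and samples in columns, as the transpose in the question suggests; the values and the names expr, cor.mat and keep are made up for the example. It also avoids reusing cor and matrix as variable names so the base R functions are not masked.

```r
# Toy expression matrix: genes in rows, samples in columns (made-up values).
expr <- rbind(
  geneA = c(1, 2, 3, 4, 5),
  geneB = c(2, 4, 6, 8, 11),  # rises with geneA
  geneC = c(5, 1, 4, 2, 3),   # unrelated to the others
  geneD = c(9, 8, 6, 5, 2)    # falls as geneA rises
)

# cor() correlates columns, so transpose to put genes in columns.
cor.mat <- cor(t(expr), use = "pairwise.complete.obs", method = "pearson")

# Positions above the diagonal whose absolute correlation exceeds 0.8.
idx <- which(abs(cor.mat) > 0.8 & upper.tri(cor.mat), arr.ind = TRUE)

# Keep each gene once, however many pairs it appears in.
keep <- sort(unique(c(idx[, 1], idx[, 2])))

# drop = FALSE keeps a matrix even if only one gene survives.
correlated.genes <- expr[keep, , drop = FALSE]
print(rownames(correlated.genes))
```

Because the filter uses abs(), geneD is kept even though it is negatively correlated with geneA and geneB; drop the abs() if only positive correlations are of interest.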

Thank you, Jean-Karim.
