How to manage self-loops in a gene coexpression network constructed from RNA-seq data
6.5 years ago
aishu.jp ▴ 10

I have control vs. treated RNA-seq data from a plant, from which I am trying to construct a gene coexpression network. Using the DESeq2 R package, I identified a total of 6000 significantly differentially expressed genes at an FDR cutoff of 0.05.

The normalised count matrix of these 6000 genes, obtained after rlog transformation, was passed to the cor() function to compute Pearson correlations. The pairwise correlation analysis gave ~30 million gene pairs, of which 1380285 gene pairs with a correlation >0.95 were selected and visualised in Cytoscape.

While visualising the network in Cytoscape, I observed a self-loop for every gene in the network.

  1. Is the presence of a self-loop for every gene biologically correct?
  2. If not, how can I avoid self-loops for all genes in the network and retain only the biologically significant ones?
RNA-Seq R gene cytoscape • 2.0k views

Thank you sir for your help.

Do any clustering techniques reduce the number of gene pairs and the self-loops?


Clustering algorithms will just cluster whatever data you provide. To remove self-loops, you can use the NetworkAnalyzer plugin for Cytoscape or just remove them in your correlation matrix after you generate it.

For example, you could set all perfect correlations to NA or some low value, such that they will be filtered:

cormat
             [,1]         [,2]        [,3]        [,4]         [,5]
[1,]  1.000000000 -0.008671749 -0.13205923 -0.12919820  0.005133225
[2,] -0.008671749  1.000000000 -0.04800790  0.16655794 -0.075665340
[3,] -0.132059234 -0.048007902  1.00000000  0.08883567 -0.046017194
[4,] -0.129198197  0.166557935  0.08883567  1.00000000 -0.344062904
[5,]  0.005133225 -0.075665340 -0.04601719 -0.34406290  1.000000000

cormat[cormat==1] <- NA

cormat
             [,1]         [,2]        [,3]        [,4]         [,5]
[1,]           NA -0.008671749 -0.13205923 -0.12919820  0.005133225
[2,] -0.008671749           NA -0.04800790  0.16655794 -0.075665340
[3,] -0.132059234 -0.048007902          NA  0.08883567 -0.046017194
[4,] -0.129198197  0.166557935  0.08883567          NA -0.344062904
[5,]  0.005133225 -0.075665340 -0.04601719 -0.34406290           NA

Maybe diag is safer?

diag(cormat) <- NA

Other useful functions: lower.tri, upper.tri
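
As a rough sketch of how diag, lower.tri, and upper.tri could fit together, here is one way to turn a correlation matrix into an edge list with no self-loops and no duplicated pairs. The data are simulated, and the names expr, cutoff, and edges are made up for illustration; they are not from the original post.

```r
# Simulated expression matrix: 20 samples (rows) x 5 genes (columns).
# cor() correlates columns, so a genes-in-rows matrix would need t() first.
set.seed(1)
expr <- matrix(rnorm(100), nrow = 20, ncol = 5,
               dimnames = list(NULL, paste0("gene", 1:5)))

# Make gene2 nearly identical to gene1 so that one pair passes the cutoff
expr[, "gene2"] <- expr[, "gene1"] + rnorm(20, sd = 0.01)

cormat <- cor(expr, method = "pearson")

# Blank out the diagonal (self-correlations) and the lower triangle
# (which only mirrors the upper triangle)
cormat[lower.tri(cormat, diag = TRUE)] <- NA

cutoff <- 0.95  # illustrative threshold, same value as in the question

# which() skips NA entries, so only upper-triangle pairs can be returned
idx <- which(cormat > cutoff, arr.ind = TRUE)

edges <- data.frame(
  source = rownames(cormat)[idx[, "row"]],
  target = colnames(cormat)[idx[, "col"]],
  r      = cormat[idx]
)
print(edges)
```

A table like edges could then be written out with write.table and imported into Cytoscape as a network, which would avoid the self-loops at the source rather than removing them afterwards.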


Good point, zx8754

6.5 years ago

By generating a correlation matrix and filtering on Pearson correlation r>0.95, you are including only positive (not inverse) correlations. Your final list will also contain the correlation of each gene with itself (r = 1), which is what creates the self-loops.

You can remove these with the NetworkAnalyzer plugin for Cytoscape.
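
To illustrate both points on a toy example (simulated data; the gene names and the abs() variant are assumptions for illustration, not part of the original analysis):

```r
# Simulated matrix: 20 samples (rows) x 3 genes (columns)
set.seed(42)
expr <- matrix(rnorm(60), nrow = 20, ncol = 3,
               dimnames = list(NULL, c("geneA", "geneB", "geneC")))

# Make geneC strongly anti-correlated with geneA
expr[, "geneC"] <- -expr[, "geneA"] + rnorm(20, sd = 0.01)

cormat <- cor(expr)

# Every diagonal entry (gene vs. itself) passes r > 0.95: the self-loops
passes <- cormat > 0.95
n_self <- sum(diag(passes))

# The inverse pair is dropped by r > 0.95 but kept by abs(r) > 0.95
inverse_kept_signed <- passes["geneA", "geneC"]
inverse_kept_abs    <- abs(cormat["geneA", "geneC"]) > 0.95

print(n_self)
print(inverse_kept_signed)
print(inverse_kept_abs)
```

Whether strongly inverse correlations belong in the network is a design decision; filtering on abs(r) is only one option if they are of interest.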

Kevin
