Question

An Easy Way To Permute/Randomise Biological Networks?

4

Entering edit mode

12.9 years ago

Darren J. Fitzpatrick ★ 1.1k

Hi,

Given a biological network, e.g., protein-protein interactions, genetic interactions, etc., is there a 'relatively easy way to permute such networks such that,

The Degree distribution is preserved.
There are no self interacting nodes.
There is only a single edge between any two nodes.

I have tried using the rewire function in the R igraph library. This preserves the degree distribution but allows for self and multiple loops.

Is there an off the shelf tool that does the above, particularly for undirected graphs?

Thanks, D.

network • 7.0k views

ADD COMMENT • link updated 12.9 years ago by Gotgenes ▴ 460 • written 12.9 years ago by Darren J. Fitzpatrick ★ 1.1k

score 5 · Answer 1 · 2012-05-22

Here is an implementation in Python excerpted from code I wrote for my dissertation. (I apologize for not posting it directly here on BioStars, but the code exceeded the character limit here.) The relevant method is EdgeSwapGraph.randomize_by_edge_swaps(). The algorithm is straightforward and should easily translate to your favorite programming language. Note the NetworkX library is a third-party dependency of this code, but installed simply with pip install networkx.

In practice, we found 10 iterations proved sufficient for our purposes, so I would start with that.

score 1 · Answer 2 · 2012-05-22

1

Entering edit mode

12.9 years ago

Istvan Albert 102k

One approach could be to relabel (randomize) the node ids. This will preserve the network structure but alter the lists of nodes selected via certain network properties.

ADD COMMENT • link 12.9 years ago by Istvan Albert 102k

2

Entering edit mode

This works, but note that randomizing node IDs will not preserve the degree distribution for the gene/protein/molecule, only for the network as a whole. In other words, if you started with 2 genes gene a and gene b that had 10 neighbors, there will still be 2 genes with 10 neighbors after randomization, but it may be 2 different genes, gene x and gene y, say, and gene a and gene b may have more or fewer neighbors. The method I posted ensures that if gene a starts with 10 neighbors, it will still have 10 neighbors after randomization. Maybe that property isn't important, but sometimes it can be.

ADD REPLY • link 12.9 years ago by Gotgenes ▴ 460

1

Entering edit mode

yes, good point, thanks for clarifying that

ADD REPLY • link 12.9 years ago by Istvan Albert 102k

0

Entering edit mode

I figured that it wouldn't preserve the degree distribution. I decided to use the mfinder package from Uri Alon. It was initially designed to look for over-represented motifs but also does the job of randomising networks.

ADD REPLY • link 12.9 years ago by Darren J. Fitzpatrick ★ 1.1k

1

Entering edit mode

Just a heads up, some people that I do trust say that the Uri Alon paper (and therefore I assume the software itself) is fundamentally incorrect. This is matters only if you looking for subgraph enrichment. See http://www.biomedcentral.com/1752-0509/2/73

ADD REPLY • link 12.9 years ago by Istvan Albert 102k