Question

Efficient Network Motif Finder

10

Entering edit mode

14.3 years ago

Simon Cockell 7.4k

I'm currently trying to find over-represented network motifs in a (relatively) large graph. I've experimented with a few tools, but found them all to be unworkably slow (one tool is still running, 2 weeks later).

Could I ask for your recommendations for a (fast) algorithm to solve this problem?

(PS, I'm currently looking for 6 node motifs in a network of ~120,000 nodes & ~600,000 edges)

EDIT - thanks for the answers so far, I'm trying out FANMOD, but it's 27 hours in and still hasn't got past 1% of the input graph, let alone onto analysing random graphs. Not sure there's a ready made solution to this problem.

graphs network • 8.3k views

ADD COMMENT • link updated 14.3 years ago by Istvan Albert 101k • written 14.3 years ago by Simon Cockell 7.4k

1

Entering edit mode

I think in a graph that large the most you can do is compile node/edge statistics (degree distributions, clustering coefficients etc). Most graph algorithms are probably unfeasible.

ADD REPLY • link 14.3 years ago by Istvan Albert 101k

0

Entering edit mode

What tools have you tried?

ADD REPLY • link 14.3 years ago by Neilfws 49k

0

Entering edit mode

Good point - currently running - Kavosh (2 weeks and counting) and mfinder (a week and still going).

ADD REPLY • link updated 6.2 years ago by Ram 44k • written 14.3 years ago by Simon Cockell 7.4k

0

Entering edit mode

Is your graph node and/or edge labeled? Having a hypothesis helps i.e. if you have some idea on how a netowork motif looks (and what its edge/node properties are) you can implement a function that looks for it.

ADD REPLY • link 14.3 years ago by Marcin Cieslik ▴ 520

0

Entering edit mode

Those interested in network motifs should also read the paper here written by colleagues here at Penn State. It makes a strong point that the seminal papers on network motifs (thus possibly the software for it as long as it reproduces the results) are fundamentally incorrect.

ADD REPLY • link updated 6.2 years ago by Ram 44k • written 14.3 years ago by Istvan Albert 101k

0

Entering edit mode

Algorithms in the spirit of PageRank do not need to analyze the network as a whole to return meaningful statistics about any node

ADD REPLY • link 14.3 years ago by Marcin Cieslik ▴ 520

0

Entering edit mode

FANMOD had been excellent for my 4 node motifs. Author Sebestian is very helping and responds real quick.

ADD REPLY • link 11.1 years ago by haider.waseem • 0

Ram · Answer 1 · 2010-07-21

11

Entering edit mode

14.3 years ago

Neilfws 49k

I have not tried it but FANMOD claims to be "much faster than any other available motif detection tool, especially for larger motifs of size > 4". There's an accompanying publication in Bioinformatics.

My usual favourite, igraph, only handles motif size 3 or 4 for some reason and I imagine would be quite slow anyway.

ADD COMMENT • link updated 6.2 years ago by Ram 44k • written 14.3 years ago by Neilfws 49k

score 6 · Answer 2 · 2010-07-21

6

Entering edit mode

14.3 years ago

Jussi ▴ 180

We have successfully used SQL (Oracle) to find the motifs from table containing over 10 million edges. The scalability was very good, the largest pathway was about 500 000 edges and we could easily search for 6 node motifs. We even tried finding 20 node motifs, but then it took days to get the result.

Unfortunately, I do not have the source code, but the implementation is straightforward: 2 node motif is simple select from the edge table. Extra nodes in the motif are implemented using self joins where join condition is based on nodes connecting the edges.

ADD COMMENT • link 14.3 years ago by Jussi ▴ 180

0

Entering edit mode

That's an interesting approach. How much time it took to get the 6 node motifs from the tables of 10 million edges ?

ADD REPLY • link 14.3 years ago by Khader Shameer 18k

0

Entering edit mode

It varies a lot since some motifs or sub-motifs are more common than the others. If the motif does not exist, then the response time was less than a second. I do not remember how long did take to retrieve common motifs and unfortunately the whole database is not available anymore for testing...

ADD REPLY • link 14.3 years ago by Jussi ▴ 180

0

Entering edit mode

That database is not available any more, but I tested similar configuration on my laptop. I created a table for edge data and populated with 600000 random edges of 120 000 nodes. The total size of the table is not relevant, the size of the pathway is more important. So, the result: finding instances of one 6 node motif takes about 0.5 seconds.

ADD REPLY • link 14.3 years ago by Jussi ▴ 180

Ram · Answer 3 · 2010-07-21

6

Entering edit mode

14.3 years ago

Istvan Albert 101k

Those interested in network motifs should also read the paper titled On the origin of distribution patterns of motifs in biological networks written by colleagues here at Penn State. It makes a strong case that the seminal papers on network motifs (thus possibly the software for it as long as it reproduces the results) are fundamentally incorrect

ADD COMMENT • link updated 6.2 years ago by Ram 44k • written 14.3 years ago by Istvan Albert 101k

0

Entering edit mode

Very interesting paper. Thanks for pointing this.

ADD REPLY • link 14.3 years ago by Khader Shameer 18k

score 4 · Answer 4 · 2010-07-21

4

Entering edit mode

14.3 years ago

Lars Juhl Jensen 11k

I have previously used FANMOD with some success. It is in my experience much faster than the alternatives.

ADD COMMENT • link 14.3 years ago by Lars Juhl Jensen 11k