Protein-protein interaction (PPI)
7.4 years ago
wonseongsik ▴ 50

Hi all,

I would like to draw a protein-protein interaction (PPI) network based on my expression data (RNA-seq).

It would be great if I could get some quick tips for PPI analysis.

  1. How do I decide which gene should be at the center of the network?

  2. How do I calculate (or which statistical methods are used to calculate) the size of each node (protein) and the distance between proteins?

  3. Which package is commonly used to visualize the network in R?

  4. Should the interactions between proteins be based on an interaction database such as KEGG?

Pointers to handy tutorials, as well as any comments and advice, would be much appreciated.

Thanks in advance.

SS

RNA-Seq • 4.9k views
7.4 years ago

How do I decide which gene should be at the center of the network?

I am not sure what you mean by this. Do you mean the center of the layout? Graphs are typically drawn with a layout algorithm that positions the nodes automatically, and the result is sometimes adjusted manually afterwards. Only you can judge what makes the most visual sense given your data and the question you are trying to answer.

How do I calculate (or which statistical methods are used to calculate) the size of each node (protein) and the distance between proteins?

You can make the size of the nodes proportional to any node property that is relevant to your project. Layout algorithms that take edge weights into account place the nodes so that the distances between them give an indication of the differences in weights, i.e. some algorithms position nodes linked by higher weights closer to each other.
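
As an illustration of sizing nodes by a property, here is a minimal sketch in plain Python (standard library only) that uses node degree as the property and rescales it to a plotting size. The edge list, the protein names and the size range of 5 to 25 are made-up assumptions; in igraph you would compute the degree with its own functions and pass the resulting sizes to the plotting call.

```python
from collections import defaultdict

# Hypothetical undirected PPI edges; the protein names are placeholders.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]


def node_degrees(edge_list):
    """Count how many edges touch each node of an undirected graph."""
    degree = defaultdict(int)
    for u, v in edge_list:
        degree[u] += 1
        degree[v] += 1
    return dict(degree)


def scale_sizes(values, min_size=5.0, max_size=25.0):
    """Linearly rescale a node property to the range [min_size, max_size]."""
    lo, hi = min(values.values()), max(values.values())
    span = hi - lo
    if span == 0:
        # All nodes share the same value: give them one mid-range size.
        return {node: (min_size + max_size) / 2 for node in values}
    return {
        node: min_size + (value - lo) / span * (max_size - min_size)
        for node, value in values.items()
    }


degrees = node_degrees(edges)
sizes = scale_sizes(degrees)
```

Any other per-node value (for example a fold change from the expression data) could be passed to `scale_sizes` in place of the degree.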

Which package is commonly used to visualize the network in R?

I would recommend the igraph package. Have a look at the tkplot() function to manually adjust layouts.

Should the interactions between proteins be based on an interaction database such as KEGG?

It depends on what you mean by interaction. If you mean physical protein-protein interaction, I would recommend working with protein-protein interaction databases. Even better, combine several of them or use iRefIndex (although it's now two years old) to get more coverage.

Thank you for your instructive reply. It helped a lot. I was asking which node would be in the center of the network, but you have already given me a good explanation. I'm going to take a look at the igraph package, but it seems there is a lot to read.

So, can I ask a few simple things about igraph? Is it possible to calculate the node size and the distance between nodes with functions in the igraph package?

Thanks again for the resources and database link.

It depends on what you mean by calculate. You can definitely set the node size and color. See this tutorial on network visualization in R. In igraph, the layout is just a matrix of node coordinates. Since layout functions return such a matrix, you can modify it before adding it to the graph (for example with tkplot, or by computing new values yourself; see the graph layouts section of the igraph manual) or simply supply your own layout function.
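
To make the "layout is just a matrix of coordinates" idea concrete, here is a small sketch in plain Python rather than igraph. `circle_layout` is a hypothetical helper, not an igraph function; it returns one `[x, y]` row per node, and one row is then overwritten by hand, which is the same kind of adjustment you would make to the matrix returned by an igraph layout function.

```python
import math


def circle_layout(nodes, radius=1.0):
    """Return one [x, y] row per node, evenly spaced on a circle."""
    n = len(nodes)
    return [
        [radius * math.cos(2 * math.pi * i / n),
         radius * math.sin(2 * math.pi * i / n)]
        for i in range(n)
    ]


# Hypothetical node names; row i of the layout belongs to nodes[i].
nodes = ["A", "B", "C", "D"]
layout = circle_layout(nodes)

# Manual adjustment: put node "A" at the center of the drawing.
layout[nodes.index("A")] = [0.0, 0.0]
```

This also answers the earlier question about the center of the network in a practical sense: whichever node you want in the middle can simply be given the middle coordinates.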

What I meant by "calculation" was something like calculating the clustering coefficient and coexpression.

I have a couple of follow-up questions. 1. Is there an R package or other tool for predicting protein-protein interactions using Bayesian methods? 2. How do you determine the direction of the edge arrows when you construct a directed graph of the network?

Thank you so much for your informative reply.

igraph can certainly compute various graph and node properties.
The only Bayesian method I can think of right now that can be used to predict interactions is Bayesian networks. There are several R packages that deal with Bayesian networks (e.g. bnlearn, gRain) but I don't know of any that deals specifically with protein-protein interactions.
Protein-protein interaction graphs are undirected because direction is generally not meaningful when considering plain physical interactions. When the interaction represents some form of functional association, it may make sense to have a direction. How to determine the directionality in such cases is still an area of research, especially when using expression data; look, for example, at the literature on gene regulatory networks.
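
For the clustering coefficient mentioned in the question, the following is a rough sketch of what such a node property computes, written in plain Python on an undirected graph. The edge list is made up, and returning 0.0 for nodes with fewer than two neighbours is a convention chosen here; igraph has its own function for this and may treat that case differently.

```python
from itertools import combinations

# Hypothetical undirected PPI edges; the protein names are placeholders.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]


def build_adjacency(edge_list):
    """Map each node to the set of its neighbours (undirected graph)."""
    adjacency = {}
    for u, v in edge_list:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def local_clustering(adjacency, node):
    """Fraction of neighbour pairs of `node` that are themselves linked."""
    neighbours = adjacency[node]
    k = len(neighbours)
    if k < 2:
        # Convention chosen for this sketch: undefined cases count as 0.0.
        return 0.0
    linked_pairs = sum(
        1 for a, b in combinations(neighbours, 2) if b in adjacency[a]
    )
    return 2 * linked_pairs / (k * (k - 1))


adjacency = build_adjacency(edges)
clustering = {node: local_clustering(adjacency, node) for node in adjacency}
```

Here `A` has three neighbours of which only the pair `B`, `C` is connected, so one of its three neighbour pairs is linked.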

7.4 years ago
boczniak767 ▴ 870

Hi,

How do I decide which gene should be at the center of the network?

You need coexpression data for each pair of genes, and you need to decide which threshold to use. The threshold will define the network topology.

Should the interactions between proteins be based on an interaction database such as KEGG?

You can use one of the existing resources, such as IntAct, STRING, or BioGRID, to name a few, or construct the network based on your own experiments. I think the most valuable approach is to use both and build a composite network.

I think you should check out Cytoscape; it doesn't run in R, but it is widely used.

Thank you for the comments. May I ask how to determine which genes are coexpressed among 20,000 genes? Is it done through gene set enrichment analysis? I have looked at Cytoscape. It looks nice, but my concern is the visualization: I would like to draw the network the way I want.

As for coexpression: I have never done it myself, but the most common approach is to compute correlations. Cytoscape is very flexible and allows custom visualisation. It does take some time to get used to, so I advise searching for some tutorials.
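
A minimal sketch of the correlation approach, in plain Python: compute the Pearson correlation for every pair of genes and keep the pairs above a threshold as edges. The expression values, the gene names, the 0.9 cutoff and the choice to use the absolute correlation (so that strong negative correlation also counts) are all assumptions for illustration, not recommendations.

```python
import math
from itertools import combinations

# Made-up expression values: one list per gene, one entry per sample.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 2.0, 4.0],
}


def pearson(x, y):
    """Pearson correlation of two equally long lists of numbers."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    denom = math.sqrt(var_x * var_y)
    # A gene with constant expression has no defined correlation; use 0.0.
    return cov / denom if denom > 0 else 0.0


def coexpression_edges(expr, threshold=0.9):
    """Keep gene pairs whose absolute correlation reaches the threshold."""
    kept = []
    for g1, g2 in combinations(sorted(expr), 2):
        r = pearson(expr[g1], expr[g2])
        if abs(r) >= threshold:
            kept.append((g1, g2, r))
    return kept


edges = coexpression_edges(expression)
```

With 20,000 genes there are roughly 2 x 10^8 gene pairs, so in practice you would use vectorised correlation functions and probably filter the genes first; the loop above only illustrates the idea.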
