I want to do a project in which I apply a community detection algorithm to protein-protein interaction (PPI) data to obtain clusters, and then compare those clusters with experimentally determined protein complexes. Can someone please help me find data for this?
P.S.: I am a computer science student, new to the field of bioinformatics.
If you want to download PPI data, one good source is BioGRID. You can download the dataset in different formats at http://thebiogrid.org/download.php. The flat files have one interaction per row, along with the type of experiment used to detect it, in case you want to filter by technique.
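As a rough sketch of the filtering idea, the snippet below reads a tab-delimited interaction file and keeps only rows from chosen experiment types. The column names are what I assume the BioGRID tab-delimited header uses, so check them against your download. `load_ppi` and the three sample rows are made up for illustration.

```python
import csv
import io
from collections import defaultdict

# Assumed column names; verify against the header line of your own file.
COL_A = "Official Symbol Interactor A"
COL_B = "Official Symbol Interactor B"
COL_SYSTEM = "Experimental System"


def load_ppi(handle, allowed_systems=None):
    """Build an undirected adjacency map from a tab-delimited interaction file.

    If allowed_systems is given, rows whose experiment type is not in it
    are skipped. Self-interactions are dropped.
    """
    graph = defaultdict(set)
    for row in csv.DictReader(handle, delimiter="\t"):
        if allowed_systems is not None and row[COL_SYSTEM] not in allowed_systems:
            continue
        a, b = row[COL_A], row[COL_B]
        if a == b:
            continue  # a self-loop carries no information for clustering
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)


# Made-up rows standing in for a real download.
sample = (
    "Official Symbol Interactor A\tOfficial Symbol Interactor B\tExperimental System\n"
    "P1\tP2\tAffinity Capture-MS\n"
    "P2\tP3\tTwo-hybrid\n"
    "P3\tP3\tAffinity Capture-MS\n"
)

ppi = load_ppi(io.StringIO(sample), allowed_systems={"Affinity Capture-MS"})
print(ppi)
```

With a real file you would pass `open(path, newline="")` instead of the `io.StringIO` sample. The resulting adjacency map can then be handed to whatever community detection library you choose.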
written 9.1 years ago by 5utr • updated 5.1 years ago by Ram
Thanks, but I also want a protein complex dataset so that I can verify the clusters I obtain.
As a source of curated interactions and complexes, you can also use Reactome. Note that the notion of a protein complex is a bit fuzzy, and not everybody agrees on the composition of a given complex; for example, some would view a large complex as being made of smaller complexes. This means that the granularity of your clustering may differ from that of your chosen reference complexes without being wrong. In addition, a non-negligible fraction of proteins participate in more than one complex, so you should consider soft (overlapping) clustering options.
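To make the comparison concrete, here is a minimal sketch that scores each predicted cluster against each reference complex with the Jaccard index and reports the best match. This is only one possible overlap measure, not a standard prescribed by any of the databases. The function names and toy protein sets are made up. Clusters are plain sets, so a protein may appear in more than one of them.

```python
def jaccard(a, b):
    """Jaccard index of two collections of protein identifiers."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_matches(clusters, complexes):
    """For each cluster, return (cluster index, best complex index, score)."""
    if not complexes:
        return []
    results = []
    for i, cluster in enumerate(clusters):
        scores = [jaccard(cluster, ref) for ref in complexes]
        best = max(range(len(complexes)), key=scores.__getitem__)
        results.append((i, best, scores[best]))
    return results


# Toy data: protein "C" sits in both predicted clusters (overlapping clustering).
clusters = [{"A", "B", "C"}, {"C", "D"}]
complexes = [{"A", "B"}, {"C", "D", "E"}]

matches = best_matches(clusters, complexes)
for cluster_idx, complex_idx, score in matches:
    print(f"cluster {cluster_idx} -> complex {complex_idx} (Jaccard {score:.2f})")
```

A cluster that is a superset or subset of a reference complex gets an intermediate score rather than zero, which is one way to see granularity mismatches instead of treating them as plain errors.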
As a source of curated protein complexes, you can use CORUM, a mammalian database with curated protein complexes for human, mouse and rat.