Protein Subcellular Localization Prediction
4
4
13.0 years ago

Can anyone recommend methods or tools for predicting the subcellular localization of proteins, e.g. for deciding which of a group of genes is likely to encode extracellular proteins? Wikipedia has a nice article on the topic: protein subcellular localization.

It even lists some tools. These predictors tend to be specialized for proteins in different organisms. Limited info on each tool is provided and several are quite old now. I am interested in any of the following:

  1. A high level explanation of how these tools work or perhaps a citation for a review of informatic methods related to the problem.
  2. Recommended tools for de novo prediction of protein subcellular localization in human and mouse, and why you would recommend each tool.
  3. Databases where these predictions have already been determined for all proteins.
  4. Comments on whether it might be possible to identify cases where a single amino acid change (resulting from a mutation perhaps) modifies subcellular localization of the protein.
protein annotation prediction • 7.4k views
0

The first two answers posted for this question provide some good insight on the first two of my points above.

0

Are you looking for subcellular localization data for all proteomes out there, or do you have a specific organism or species in mind?

0

Good point. I meant to specify this in the question. I'm particularly interested in human and mouse...

0


I believe all four points of my question have now been covered to varying degrees by some great responses. Neilfws reminded me to be more specific about which species I am studying, so if anyone has comments on tools/methods/databases that are particularly relevant to human and mouse, that would be appreciated.

0

All answers were useful and provide distinct info. I'm giving the check mark to the one pointing to a Nature Protocols paper because it is a good place to start.

4
13.0 years ago
Lee Katz ★ 3.2k

This paper goes over a really great strategy:

http://www.nature.com/nprot/journal/v2/n4/abs/nprot.2007.131.html

Locating proteins in the cell using TargetP, SignalP and related tools

Olof Emanuelsson, Søren Brunak, Gunnar von Heijne & Henrik Nielsen

Abstract: Determining the subcellular localization of a protein is an important first step toward understanding its function. Here, we describe the properties of three well-known N-terminal sequence motifs directing proteins to the secretory pathway, mitochondria and chloroplasts, and sketch a brief history of methods to predict subcellular localization based on these sorting signals and other sequence properties. We then outline how to use a number of internet-accessible tools to arrive at a reliable subcellular localization prediction for eukaryotic and prokaryotic proteins. In particular, we provide detailed step-by-step instructions for the coupled use of the amino-acid sequence-based predictors TargetP, SignalP, ChloroP and TMHMM, which are all hosted at the Center for Biological Sequence Analysis, Technical University of Denmark. In addition, we describe and provide web references to other useful subcellular localization predictors. Finally, we discuss predictive performance measures in general and the performance of TargetP and SignalP in particular.
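To give a feel for what "coupled use" of several predictors can look like, here is a minimal sketch of combining per-tool calls into one coarse label. This is not the protocol from the paper: the `Calls` fields and the rule order are assumptions made for illustration, and you would fill the fields by parsing each tool's output yourself.

```python
from dataclasses import dataclass


@dataclass
class Calls:
    """Hypothetical summary of per-protein predictor output (not a real tool format)."""
    signal_peptide: bool          # e.g. from a signal peptide predictor
    mito_targeting_peptide: bool  # e.g. from a targeting peptide predictor
    tm_helices: int               # helices predicted outside any signal peptide


def coarse_label(c: Calls) -> str:
    """Combine the calls with a simple, assumed rule order."""
    if c.signal_peptide and c.tm_helices == 0:
        return "secreted candidate"
    if c.tm_helices > 0:
        return "membrane candidate"
    if c.mito_targeting_peptide:
        return "mitochondrial candidate"
    return "no N-terminal sorting signal detected"


print(coarse_label(Calls(signal_peptide=True, mito_targeting_peptide=False, tm_helices=0)))
```

The labels are deliberately called "candidates": rules like these only narrow a list down before you look at the individual predictions.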

0

This paper does look like a very useful reference. Hopefully some of these methods have standalone versions as well as 'internet-accessible' ones.

0

+1 - I have felt for a long time that many of the tools from the CBS at the Technical University of Denmark are good. One reason is their use of high-quality training sets.

3
13.0 years ago
Neilfws 49k

There are lots of SCL prediction tools, but increasingly there's also a lot of experimental data.

One of the best resources is the LOCATE database. It's a curated database with SCL information for human and mouse proteins derived from the literature, high-throughput immunofluorescence experiments and a computational pipeline for membrane protein annotation.

Just FYI, there's a similar resource for yeast: the yeast GFP fusion localization database.

I suspect that in the near-future, high-throughput experimental methods will supplant prediction, at least for intensively-studied model organisms.

1
13.0 years ago

I've tried to classify secreted proteins in my transcriptome assembly before with methods listed in this paper: http://www.ncbi.nlm.nih.gov/pubmed/14724739

Wolf-PSort is the successor to the PSort software described in the paper; it tries to predict localization sites.

Most methods you'll find for subcellular localization depend on signal peptide prediction, usually with an HMM-based method.

For my dataset, fortunately, a previous proteomics study of secreted proteins had produced a candidate list. What I found is that only around 20% of the predicted secreted proteins overlapped with the proteomics study.
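For anyone wanting to run the same kind of comparison, here is a minimal sketch of the overlap calculation. The protein IDs are made up for illustration, and "overlap" is assumed to mean the fraction of predicted proteins that also appear in the experimental list.

```python
def overlap_fraction(predicted, observed):
    """Return the fraction of predicted IDs that also appear in the observed set."""
    predicted, observed = set(predicted), set(observed)
    if not predicted:
        return 0.0
    return len(predicted & observed) / len(predicted)


# Hypothetical IDs, for illustration only.
predicted_secreted = ["p1", "p2", "p3", "p4", "p5"]
proteomics_secreted = ["p2", "p9", "p10"]

print(overlap_fraction(predicted_secreted, proteomics_secreted))  # 0.2
```

It is worth computing the fraction in the other direction too (experimental hits recovered by prediction), since the two numbers answer different questions.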

This may simply reflect the limited taxonomic categories offered by the various tools: Wolf-PSort only has options for animals, plants, and fungi. If you built your own set of HMMs from known mammalian signal peptide sequences, you might have better luck.

0

Thanks for another great reference and interesting comments from your own experiences. We have been using Wolf-PSort recently and it seems promising. Part of the reason for posting this question was to see if there are any popular alternatives.

1
13.0 years ago
Naga ▴ 450
  1. Have you tried MultiLoc2? The paper claims that it performs better than Wolf-PSORT and other eukaryotic subcellular localization prediction tools, but it's 2 years old now.

  2. Although signal peptides have a highly conserved positive-hydrophobic-polar structure over a length of 25 AAs (e.g. in the case of the general signal peptide), the AA sequence similarity is not so high. So my guess is that a neutral point mutation in the signal peptide region should not affect the signal peptide prediction.
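As a rough way to see why a conservative substitution changes little, here is a minimal sketch that computes a sliding-window Kyte-Doolittle hydropathy profile over the N-terminus. It is not how SignalP or any other predictor works; the window size, the 30-residue cutoff and the example sequence are assumptions for illustration.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_hydropathy(seq, window=7):
    """Mean hydropathy of each full window along seq (raises KeyError on non-standard residues)."""
    scores = [KD[aa] for aa in seq.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def max_hydropathy(seq, n_term=30, window=7):
    """Highest window mean within the first n_term residues."""
    profile = window_hydropathy(seq[:n_term], window)
    return max(profile) if profile else float("nan")


# Hypothetical N-terminal sequence, invented for illustration only.
wild_type = "MKKTALLLLAVLLAGCSAQE"
mutant = wild_type[:7] + "I" + wild_type[8:]  # L8I, a conservative substitution

print(max_hydropathy(wild_type), max_hydropathy(mutant))
```

Swapping one hydrophobic residue for another barely moves the peak, which is the intuition behind a "neutral" mutation not changing the call. A charged residue dropped into the hydrophobic stretch would move it much more.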

But the TAT signal peptide (which should have 'RR' in the signal peptide sequence) and the lipoprotein signal peptide (which should have 'C' at the signal peptide cleavage site) have highly conserved motifs/AAs at certain positions. A mutation at these positions will definitely affect the subcellular localization of the protein (but if that happens, it's no longer a TAT or lipoprotein signal peptide :) ).
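The 'RR' point can be illustrated with a toy check. This is a minimal sketch, not a TAT predictor: it only tests for two consecutive arginines near the N-terminus, and the 35-residue cutoff and the example sequence are assumptions.

```python
def has_twin_arginine(seq, n_term=35):
    """True if 'RR' occurs within the first n_term residues (crude motif test only)."""
    return "RR" in seq[:n_term].upper()


def substitute(seq, position, new_aa):
    """Return seq with the residue at 1-based position replaced by new_aa."""
    if not 1 <= position <= len(seq):
        raise ValueError("position out of range")
    return seq[:position - 1] + new_aa + seq[position:]


# Hypothetical sequence, invented for illustration only.
wild_type = "MSNDLSRRQFLKGLAATAA"
mutant = substitute(wild_type, 7, "K")  # R7K

print(has_twin_arginine(wild_type), has_twin_arginine(mutant))  # True False
```

A real predictor looks at much more than the motif, so treat a lost 'RR' as a reason to rerun the prediction, not as the answer.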

Play with the SignalP, TatP or LipoP tools (TatP and LipoP are bacterial signal peptide predictors; SignalP also covers eukaryotes) and see how a 'mutation' affects the signal peptide prediction, which ultimately determines the final subcellular localization call.
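If you want to do this systematically rather than by hand, here is a minimal sketch that writes the wild type plus every single-residue substitution in a chosen region as FASTA, which you could then submit to whichever predictor you use. The function names and the record naming scheme (e.g. `R7K`) are my own choices, not part of any tool.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def point_mutants(seq, start=1, end=None):
    """Yield (name, sequence) for every single substitution in 1-based [start, end]."""
    seq = seq.upper()
    end = len(seq) if end is None else min(end, len(seq))
    for pos in range(start, end + 1):
        wt = seq[pos - 1]
        for aa in AMINO_ACIDS:
            if aa != wt:
                yield f"{wt}{pos}{aa}", seq[:pos - 1] + aa + seq[pos:]


def to_fasta(seq_id, seq, start=1, end=None):
    """Return FASTA text with the wild type first, then one record per mutant."""
    seq = seq.upper()
    records = [f">{seq_id}_WT\n{seq}"]
    records += [f">{seq_id}_{name}\n{mut}"
                for name, mut in point_mutants(seq, start, end)]
    return "\n".join(records) + "\n"


# Hypothetical sequence; restrict mutations to the first 5 residues.
print(to_fasta("example", "MSNDLSRRQFLKGLAATAA", start=1, end=5)[:60])
```

Restricting `start` and `end` to the signal peptide region keeps the number of sequences manageable (19 per position).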

Just to add one more point, if the protein's subcellular localization is determined by a signal peptide, mutation at other parts of the protein sequence should not have any effect on the sorting of the protein to its native subcellular localization.

0

Thanks. We will check out MultiLoc2. I also appreciate your interesting comments on the 4th part of my question: the potential effect of single amino acid changes on localization.
