Algorithms Predicting Effects Of SNPs / AA Substitution On Protein
14.0 years ago
Prateek ★ 1.0k

There are many algorithms available online that let you assess the potential effects of a SNP or an amino acid substitution on protein function (phenotype). Some are based on sequence homology alone, some include structure and others use machine learning algorithms to include many different variables.

Which ones do you prefer in terms of scientific basis, ease of use, and the scale of data they can handle?

The ones I am aware of are:
1. SIFT
2. PolyPhen
3. MAPP
4. Align-GVGD
5. PANTHER
6. PMut
7. SNPs3D

algorithm snp protein • 7.6k views
ADD COMMENT
14.0 years ago
Biomed 5.0k

As a rule of thumb, I prefer methods that are based on conservation. One advantage is that they are also compatible with non-coding but highly conserved regions, whereas protein-structure-based methods only work for coding sequences. With more and more whole-genome sequences on the horizon, I suspect these conservation-based methods will become more helpful for intronic and intergenic sequences as well. SIFT is the classic example. I also know that protein-structure-based tools like PolyPhen are adopting conservation-based methodology (i.e. PolyPhen-2) to improve their results.
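To make "based on conservation" concrete, here is a minimal sketch that scores each column of a multiple alignment by normalized Shannon entropy. This is only an illustration of the general idea, not the scoring that SIFT or PolyPhen-2 actually use; the toy alignment and the function name `column_conservation` are made up for the example. The same function works on nucleotide alignments if `alphabet_size=4` is passed.

```python
import math
from collections import Counter


def column_conservation(column, alphabet_size=20):
    """Return a 0..1 conservation score for one alignment column.

    1.0 means every sequence carries the same residue; lower values mean
    the residues are more evenly spread. Gap characters ('-') are ignored.
    """
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    # The largest possible entropy is limited by both the alphabet and
    # the number of sequences in the column.
    max_entropy = math.log2(min(alphabet_size, total))
    if max_entropy == 0:
        return 1.0
    return 1.0 - entropy / max_entropy


# Hypothetical toy protein alignment (four sequences, five columns).
alignment = [
    "MKVLA",
    "MKILA",
    "MRVLG",
    "MKVLS",
]

# zip(*alignment) walks the alignment column by column.
scores = [column_conservation(col) for col in zip(*alignment)]
for position, score in enumerate(scores, start=1):
    print(f"column {position}: conservation {score:.2f}")
```

A substitution in a column with a score near 1.0 would be flagged as more suspicious than one in a variable column; real tools weight this with much richer models.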

ADD COMMENT

I agree - I also think that methods using structure-based prediction use it in addition to, not instead of, sequence conservation - but I am not sure. By the way, are you sure SIFT works with non-coding sequences? I think SIFT bases its predictions on amino acid conservation rather than nucleotide conservation.
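The amino acid versus nucleotide distinction can be illustrated with a small sketch: a tool that works at the protein level only sees a SNP if it changes the encoded residue. The code below uses the standard genetic code to classify a single-base change within one codon. It is an illustration only, not part of SIFT; the function name `classify_codon_change` and the example codons are chosen for the example.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_change(ref_codon, position, alt_base):
    """Classify a single-base change inside one codon.

    position is the 0-based index (0, 1 or 2) of the changed base.
    Returns (kind, reference amino acid, altered amino acid).
    """
    ref_codon = ref_codon.upper()
    alt_codon = ref_codon[:position] + alt_base.upper() + ref_codon[position + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        kind = "synonymous"   # invisible to amino-acid-level predictors
    elif alt_aa == "*":
        kind = "nonsense"     # introduces a stop codon
    else:
        kind = "missense"     # the case substitution predictors score
    return kind, ref_aa, alt_aa


examples = [("GAA", 2, "G"), ("GAG", 1, "T"), ("TGG", 2, "A")]
results = [classify_codon_change(*example) for example in examples]
for example, result in zip(examples, results):
    print(example, "->", result)
```

Only the missense case would be passed on to an amino-acid-level predictor; the synonymous change would need a nucleotide-level conservation score to be assessed at all.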

ADD REPLY

No, SIFT still uses precalculated data for coding regions, and these tools are trained on SNPs that are known to be disease-causing. As we get more conservation data from other species and collect more disease causation/correlation data for non-coding human variants, tools like SIFT will better predict what other non-coding variations may mean.

ADD REPLY
14.0 years ago

The SNPEffect database is very useful to me for assessing various aspects of non-synonymous SNPs. SNPEffect is an integrated resource that runs individual sequence- and structure-based prediction algorithms to predict whether a SNP affects various features.

Sequence features: Subcellular localisation, Turnover rate, Myristoylation, PTS1, Type I Geranylation, Type II Geranylation, Farnesylation, GPI Anchor, Acetylation, O-glycosylation, N-glycosylation, Phosphorylation, Propeptide cleavage sites, Signal peptides, Nuclear export signal

Structural features: Aggregation, Stability, Amyloidogenic regions, Transmembrane regions, Active Sites, Hsp70 Binding

IMHO, this resource can give you more insight into the effect of a SNP on a protein, beyond the limited information (for example, a deleterious-effect call) provided by tools like SIFT, PolyPhen or PMut.

References: SNPEffect V1, SNPEffect V2

ADD COMMENT

Khader - a bit late on this one, but is there a way to use MySQL to access the SNPEffect database?

ADD REPLY

I remember they had a database dump. You should check with the authors.

ADD REPLY
14.0 years ago
Pablo ★ 1.9k

I created the snpEff program some time ago. Please take a look at it and let me know what you think. Also, let me know if you need any new functionality (I'm actively developing it).

Answering your data scalability question: it can predict functional effects for all the SNPs from the 1000 Genomes Project in 18 minutes (on my desktop computer). I think that should be fast enough.

ADD COMMENT

Pablo, welcome to BioStar. SnpEff looks very promising. Have you written a manuscript about it?

ADD REPLY

Thank you Khader. I'll be writing a manuscript soon.

Let me know if you have some feedback on snpEff :-)

ADD REPLY

On the 1000 Genomes SNP summary page, check the order of the charts at the end of the page.

ADD REPLY

This tool is exactly what I am looking for! Thank you.

ADD REPLY
14.0 years ago

Since you are interested, have a look at this question. Earlier this morning I was writing about exactly that; maybe you can make a good contribution to the paper.

In any case, I recommend reading this paper, in which the authors predicted the phenotypic effect of a large number of non-synonymous SNPs:

Burke DF, Worth CL, Priego EM, Cheng T, Smink LJ, Todd JA, Blundell TL. Genome bioinformatic analysis of nonsynonymous SNPs. BMC Bioinformatics. 2007 Aug 20;8:301. PubMed PMID: 17708757; PubMed Central PMCID: PMC1978506.

ADD COMMENT

Giovanni: That's a very interesting paper!

ADD REPLY

Thanks for sharing. It's interesting to see the paper validate the fact that most diseases in OMIM are associated with rare SNPs with MAF < 0.05, even though I see a lot of GWAS that for some reason exclude those.
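For anyone unsure what that cutoff means in practice, here is a minimal sketch of computing the minor allele frequency (MAF) from allele counts and flagging variants below 0.05. The SNP identifiers and counts are invented for illustration and do not come from the paper; the 0.05 threshold is the one quoted above.

```python
def minor_allele_frequency(allele_counts):
    """Return the MAF: the frequency of the second most common allele.

    allele_counts maps allele -> number of observed chromosomes.
    A site with a single observed allele has a MAF of 0.0.
    """
    total = sum(allele_counts.values())
    if total == 0:
        raise ValueError("no alleles observed")
    frequencies = sorted((n / total for n in allele_counts.values()), reverse=True)
    return frequencies[1] if len(frequencies) > 1 else 0.0


# Hypothetical allele counts; identifiers are placeholders, not real rsIDs.
observed = {
    "snp_a": {"A": 980, "G": 20},
    "snp_b": {"C": 600, "T": 400},
    "snp_c": {"G": 1000},
}

MAF_CUTOFF = 0.05  # threshold quoted in the comment above

mafs = {name: minor_allele_frequency(counts) for name, counts in observed.items()}
# Rare variants: polymorphic, but below the cutoff.
rare = [name for name, maf in mafs.items() if 0 < maf < MAF_CUTOFF]

for name, maf in mafs.items():
    print(f"{name}: MAF = {maf:.3f}")
print("rare variants:", rare)
```

A study that filters on MAF >= 0.05 would drop `snp_a` here, which is exactly the class of variant the comment says is most often associated with OMIM diseases.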

ADD REPLY
14.0 years ago
Chris ★ 1.6k

You might want to check SNAP too. See http://rostlab.org/services/snap/

Chris

ADD COMMENT
8.7 years ago

SnpEff has been implemented within the online SNP pipeline: http://sniplay.southgreen.fr/cgi-bin/analysis_v3.cgi

ADD COMMENT
