Hey,
we are trying to set up a local subset of dbSNP on our group's servers. Since we are only interested in nsSNPs, we specifically need the mapping from rs# to protein sequence, i.e. the concrete RefSeq identifier, the sequence position and the mutant residue. According to the dbSNP handbook from NCBI, the organism-specific SNPContigLocusId tables are the relevant ones, and indeed they contain everything we need. However, those tables exist for only 14 of the 100 organisms overall. Does that mean that no such protein-sequence mappings exist for the vast majority of organisms? If so, why? Or could this information be stored somewhere else among the many dbSNP tables?
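To make the goal concrete, here is a rough sketch of the extraction we have in mind, assuming a tab-delimited dump of a SNPContigLocusId table. The column names and their order are placeholders chosen for illustration, not the real table layout (that has to be taken from the dbSNP schema files), and the function name is made up as well.

```python
import csv
import io

# Placeholder column names and order; NOT the real SNPContigLocusId layout.
# Replace with the column list from the dbSNP schema before using this.
ASSUMED_COLUMNS = ["snp_id", "protein_acc", "aa_position", "residue"]


def extract_protein_mappings(handle, columns=ASSUMED_COLUMNS):
    """Yield (rs#, protein accession, position, residue) tuples
    from a tab-delimited table dump."""
    reader = csv.DictReader(handle, fieldnames=columns, delimiter="\t")
    for row in reader:
        # Skip rows that carry no protein-level annotation.
        if not row["protein_acc"] or not row["aa_position"]:
            continue
        yield (
            "rs" + row["snp_id"],
            row["protein_acc"],
            int(row["aa_position"]),
            row["residue"],
        )


# Made-up sample rows, only to show the intended input shape.
sample = "123\tNP_000001\t42\tA\n456\t\t\t\n"
for mapping in extract_protein_mappings(io.StringIO(sample)):
    print(mapping)
```

This only works for organisms that actually ship such a table, which is exactly the problem described above.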
Thanks for sharing any insights, Chris
Are you interested in SNPs from all organisms, or only a subset? Such mappings are available in various nsSNP annotation databases for human; I'm not sure about other organisms.
I'm interested in nsSNPs from all organisms that show up in dbSNP. Human is among the 14 organisms that have the mappings. Thanks, Chris
Hi Chris,
How is your mapping from nsSNPs to protein sequences coming along? I am working on a similar project right now. Did you find out why the mapping from nsSNPs to protein sequences is available for only a limited number of organisms?