Question

Find all rsid synonyms for a list of rsids

6.7 years ago

Jab • 0

Hello,

I have two lists of several thousand rsids. I'd like to compare the two lists to see if there are any common SNPs between them. I am aware that many SNPs have multiple rsids, and this could result in missing matches if I don't have the correct rsid synonym in each list. Is there a way that I can simultaneously download all rsid synonyms for several thousand SNPs using the most recent version of Ensembl without typing each SNP into the search bar individually? I know that similar questions have been asked in the past, but all answers are outdated or the links provided no longer work.

Thank you.

Ensembl SNPs rsids • 3.5k views

ADD COMMENT • link updated 6.7 years ago by Mike Smith ★ 2.1k • written 6.7 years ago by Jab • 0

6.7 years ago

GenoMax 149k

Links in this answer all appear to be alive: Downloading synonyms for dbSNP rsids (the NCBI file in Pierre's answer is from Feb 2018, so pretty current).

ADD COMMENT • link 6.7 years ago by GenoMax 149k

Thank you. However, the file seems to just be symbols. Is there a special program I need to read and use the file?

ADD REPLY • link 6.7 years ago by Jab • 0

score 4 · Accepted Answer · 2018-06-28

You can do this using R and biomaRt. Here's an example.

First, let's create two example sets of SNPs. Between these, the first pair are identical, the second pair are synonyms of each other, and the third pair are distinct.

snps1 <- c('rs4844600', 'rs4266886', 'rs6656401')
snps2 <- c('rs4844600', 'rs61737012', 'rs386638846')

Next we load the biomaRt package, and query the variation mart to return all the synonyms and their sources for our first set of rsIDs.

library(biomaRt)
## use the Ensembl variation mart
snp_mart <- useMart(biomart="ENSEMBL_MART_SNP", 
                    dataset="hsapiens_snp")

## get the synonyms and their source for our SNPs
results <- getBM(filters = c('snp_filter'), 
           attributes = c('refsnp_id','synonym_name','synonym_source'), 
           values = snps1, 
           mart = snp_mart)
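
With several thousand rsIDs, a single large request may be slow or time out. One option is to send the IDs in batches and stack the results. The sketch below is only an illustration: `get_synonyms_chunked` is a hypothetical helper name, the batch size of 500 is an arbitrary assumption, and `query_fun` is a parameter added here so the function can be tried without a network connection (by default it calls `getBM` with the same arguments as above).

```r
# Hypothetical helper (not part of biomaRt): query synonyms in batches.
# chunk_size = 500 is an arbitrary choice, not a documented limit.
get_synonyms_chunked <- function(ids, mart, chunk_size = 500,
                                 query_fun = biomaRt::getBM) {
  ids <- unique(ids)
  # assign each ID to batch 1, 2, 3, ... of at most chunk_size IDs
  chunks <- split(ids, ceiling(seq_along(ids) / chunk_size))
  pieces <- lapply(chunks, function(chunk) {
    query_fun(filters = "snp_filter",
              attributes = c("refsnp_id", "synonym_name", "synonym_source"),
              values = chunk,
              mart = mart)
  })
  # stack the per-batch data frames and drop duplicated rows
  out <- unique(do.call(rbind, unname(pieces)))
  rownames(out) <- NULL
  out
}
```

Usage would then be `results <- get_synonyms_chunked(snps1, snp_mart)`, followed by the same steps as below.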

For reference, the first few rows of results look like the output below. You can filter at this stage if you know you only have synonyms from a certain source.

> head(results)
  refsnp_id             synonym_name synonym_source
1 rs4266886               rs61198255  Archive dbSNP
2 rs4266886 NM_000651.4:c.487+787T>C     dbSNP HGVS
3 rs4266886 NM_000573.3:c.487+787T>C     dbSNP HGVS
4 rs4844600               rs58362463  Archive dbSNP
5 rs4844600               rs61737012  Archive dbSNP
6 rs4844600     NP_000564.2:p.Glu60=     dbSNP HGVS
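
As a rough sketch of that filtering step, the snippet below keeps only the archived rsIDs and drops the HGVS names. To keep it self-contained, it uses a small data frame, `results_example`, typed in by hand from the six rows shown above; with a real query you would apply the same subsetting to `results`. The regular-expression alternative is an assumption about what rs-style identifiers look like, not something the mart guarantees.

```r
# Hand-copied from the head(results) output above, for illustration only
results_example <- data.frame(
  refsnp_id = c("rs4266886", "rs4266886", "rs4266886",
                "rs4844600", "rs4844600", "rs4844600"),
  synonym_name = c("rs61198255", "NM_000651.4:c.487+787T>C",
                   "NM_000573.3:c.487+787T>C", "rs58362463",
                   "rs61737012", "NP_000564.2:p.Glu60="),
  synonym_source = c("Archive dbSNP", "dbSNP HGVS", "dbSNP HGVS",
                     "Archive dbSNP", "Archive dbSNP", "dbSNP HGVS"),
  stringsAsFactors = FALSE
)

# Option 1: keep synonyms from a single source
rs_only <- results_example[results_example$synonym_source == "Archive dbSNP", ]

# Option 2: keep anything that looks like an rsID, whatever its source
rs_like <- results_example[grepl("^rs[0-9]+$", results_example$synonym_name), ]
```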

We can now combine our original set of rsIDs with their synonyms.

snps1_complete <- c(snps1, unique(results$synonym_name))

We can then ask which of our second list of IDs are in this expanded list. It finds two entries, as expected.

> snps2[snps2 %in% snps1_complete]
[1] "rs4844600"  "rs61737012"
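
If you also want to know which ID in the first list each match corresponds to, one possible approach is to build a lookup table from every known ID (primary or synonym) to its primary rsID and merge the second list against it. This is only a sketch: `syn_example`, `lookup` and `matches` are names made up here, and `syn_example` holds just the three archived rsID rows visible in the output above rather than a full query result.

```r
snps1 <- c("rs4844600", "rs4266886", "rs6656401")
snps2 <- c("rs4844600", "rs61737012", "rs386638846")

# Archived rsID synonyms copied from the example output, for illustration only
syn_example <- data.frame(
  refsnp_id = c("rs4266886", "rs4844600", "rs4844600"),
  synonym_name = c("rs61198255", "rs58362463", "rs61737012"),
  stringsAsFactors = FALSE
)

# Every ID we can recognise, paired with the ID from snps1 it belongs to
lookup <- unique(rbind(
  data.frame(id = snps1, primary = snps1, stringsAsFactors = FALSE),
  data.frame(id = syn_example$synonym_name,
             primary = syn_example$refsnp_id,
             stringsAsFactors = FALSE)
))

# Keep the snps2 entries that appear in the lookup, with their partner in snps1
matches <- merge(data.frame(id = snps2, stringsAsFactors = FALSE),
                 lookup, by = "id")
```

With these example rows, `matches` has two rows, one for each ID found above. Note that this only expands the first list; if the second list may itself contain primary IDs whose synonyms appear in the first, you would run the same query on `snps2` as well.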