I have a list of dbSNP IDs for human chromosome 8, and I need to find out which are SNPs and which are indels using BioMart.
Is it possible to do so?
Hello,
this does not seem to be provided directly by BioMart. What you can do is select "Variant alleles" in the "Attributes" section. Once you have the result, split that column on "/"
and compare the number of characters in each allele: if every allele is a single base, the variant is a SNP; if the alleles differ in length, it is an indel.
fin swimmer
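The split-and-compare idea above could be sketched in R roughly as follows. This is only an illustration under stated assumptions: `classify_alleles()` is a hypothetical helper (not part of BioMart or biomaRt), the input is assumed to be the "Variant alleles" column as a character vector such as "A/T", and a dash is assumed to mark an empty allele. Same-length alleles longer than one base are labelled "other" rather than guessed at.

```r
# Hypothetical helper: classify variants from allele strings like "A/T" or "-/C".
classify_alleles <- function(alleles) {
  vapply(strsplit(alleles, "/", fixed = TRUE), function(a) {
    # Missing or empty allele strings cannot be classified
    if (length(a) == 0 || anyNA(a)) return(NA_character_)
    # Assumption: a dash marks an empty allele, i.e. bases are inserted or deleted
    if (any(a == "-")) return("indel")
    # Every allele is exactly one base long: single-nucleotide change
    if (all(nchar(a) == 1)) return("SNP")
    # Alleles of different lengths: bases are gained or lost
    if (length(unique(nchar(a))) > 1) return("indel")
    # Same length but longer than one base (e.g. "AT/GC")
    "other"
  }, character(1))
}

classify_alleles(c("A/T", "-/C", "AT/A", "A/C/G", "AT/GC"))
# [1] "SNP"   "indel" "indel" "SNP"   "other"
```

Checking for the dash first matters because "-/C" has one character on each side and would otherwise pass the length test as a SNP.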
You can do this using the Ensembl REST API. Here's an example in R to get the info you're looking for:
library(httr)
library(jsonlite)
library(xml2)
library(dplyr)
## the rsIDs we're interested in
rs_ids <- c("rs1326880612", "rs1008829651")
## define the server and species we're querying
server <- "https://rest.ensembl.org"
ext <- "/variation/homo_sapiens"
## construct the query based on the IDs we're using
query_body <- paste0('{ "ids" : ["',
                     paste(rs_ids, collapse = '", "'),
                     '"] }')
## submit the query
r <- POST(paste0(server, ext),
          content_type("application/json"),
          accept("application/json"),
          body = query_body)
## check we got a valid response from the server
stop_for_status(r)
## extract the variant name and type
## reformat into a data.frame (tibble() replaces the deprecated data_frame())
lapply(content(r), FUN = function(x) {
  tibble(name = x$name, var_class = x$var_class)
}) %>%
  bind_rows()
And the result...
# A tibble: 2 x 2
  name         var_class
  <chr>        <chr>
1 rs1326880612 insertion
2 rs1008829651 SNP
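A side note on the request body that is pasted together by hand above: since jsonlite is already loaded, the same JSON could probably be generated with `toJSON()`, which takes care of the quoting. This is a sketch of an alternative, not part of the original answer; it only builds the body and does not contact the server.

```r
library(jsonlite)

rs_ids <- c("rs1326880612", "rs1008829651")

# toJSON() keeps character vectors as JSON arrays (auto_unbox defaults to FALSE),
# so even a single ID is still sent as {"ids":["..."]}.
# as.character() drops the "json" class so a plain string is passed on as the body.
query_body <- as.character(toJSON(list(ids = rs_ids)))
query_body
# [1] "{\"ids\":[\"rs1326880612\",\"rs1008829651\"]}"
```

The resulting string can be used as `body = query_body` in the `POST()` call in place of the hand-built one.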
Using rsnps, an R interface to openSNP and NCBI's dbSNP:
library(rsnps)
ncbi_snp_query(c("rs1326880612", "rs1008829651"))
#          Query Chromosome       Marker  Class    Gene Alleles Major Minor MAF    BP AncestralAllele
# 1 rs1326880612          1 rs1326880612 in-del DDX11L1     -/C  <NA>  <NA>  NA 10051            <NA>
# 2 rs1008829651          1 rs1008829651    snp DDX11L1     A,T     A     T  NA 10051            <NA>
this worked, thank you!
When comparing the number of characters, also check whether an allele is a dash:
"-/C"
has one character on each side, so is it a SNP or an indel?