I need to query the hg19 reference genome and retrieve several 50-nucleotide sequences from various positions (exonic, intronic, etc.). I found biomaRt in Bioconductor, and to check that it works properly I ran the following command, but it returns an empty result:
ensembl = useMart("ensembl",dataset="hsapiens_gene_ensembl")
result = getSequence(chromosome=2, start=86374868, end=86374877,type="entrezgene",seqType="gene_exon_intron", mart=ensembl)
I expect to see "acatctcgca" as the result. I think the problem comes from misconfigured function parameters. Am I using the right tool for this problem?
I would appreciate suggestions for other tools that can query a reference genome through a web service.
Hi, thanks for your response, but I get the error "Incorrect biomart name".
I don't think the problem is related to the version of the reference genome. In that case I should still get a sequence, but I always receive an empty result.
Thanks Emily, that resolved the BioMart name problem, but when I use the following command to fetch 10 bases it fetches a very long sequence. Do you have any idea why?
That's not getting you ten bases; that's getting you the gene sequences of every transcript that overlaps those ten bases. BioMart is a gene-centric tool. It cannot be used to get genome sequences.
The REST API is a quick, easy way to get sequences of regions.
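As a rough illustration of the REST approach, here is a minimal Python sketch for fetching a region by coordinates. It assumes that hg19 corresponds to GRCh37 and that the GRCh37 archive is served from grch37.rest.ensembl.org through the GET sequence/region endpoint. The function names region_url and fetch_region are made up for this example and are not part of any library.

```python
import urllib.request

# Assumption: hg19 corresponds to GRCh37, which Ensembl serves from a
# separate REST host rather than the default (current assembly) server.
GRCH37_SERVER = "https://grch37.rest.ensembl.org"


def region_url(chrom, start, end, strand=1, species="human", server=GRCH37_SERVER):
    """Build the URL for GET /sequence/region/:species/:region.

    Coordinates are 1-based and inclusive; strand is 1 or -1.
    (Hypothetical helper, written for this example only.)
    """
    if start > end:
        raise ValueError("start must not exceed end")
    return f"{server}/sequence/region/{species}/{chrom}:{start}..{end}:{strand}"


def fetch_region(chrom, start, end, **kwargs):
    """Return the plain-text sequence for a region (needs network access)."""
    request = urllib.request.Request(
        region_url(chrom, start, end, **kwargs),
        headers={"Content-Type": "text/plain"},  # ask for the bare sequence
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("ascii").strip()


# Example call (left commented out because it needs network access):
# print(fetch_region(2, 86374868, 86374877))
```

For the 50-nucleotide windows in the question, the end coordinate would be start + 49, since both ends are inclusive. Many requests in a row may run into the server's rate limits, so pausing between calls is probably sensible.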
The REST API is exactly what I was looking for.
Thanks a lot for your help, Emily.