I am looking for a way to retrieve DNA sequences from the Ensembl May 2017 archive based on genomic coordinates. I thought the biomaRt package would be useful for this, but it did not work. Apparently, a sequence type (seqType, type) is required to obtain a sequence with the getSequence function.
For example:
seq <- biomaRt::getSequence(chromosome = "X", start = 100639991, end = 100644991, mart = ensembl)
This gives the following error:
Error in biomaRt::getSequence(chromosome = "X", start = 100639991, end = 100644991, :
Please specify the type of sequence that needs to be retrieved when using biomaRt in web service mode. Choose either gene_exon, transcript_exon,transcript_exon_intron, gene_exon_intron, cdna, coding,coding_transcript_flank,coding_gene_flank,transcript_flank,gene_flank,peptide, 3utr or 5utr
Is there a good way to get the DNA sequences for a large list of genomic coordinates?
Thank you very much.
Did you check the documentation? Sequence type is one of the allowed options.
biomaRt v2.32.1 is installed, which does not allow "genomic" as the seqType. If I try I get the following:
Tagging: Mike Smith to see if he can help.
Thanks Emily, this was useful.
How can I retrieve archive sequences from older rat assemblies?
The code below works; however, it retrieves sequences from rno6 (the latest rat genome). What I need is rno4, which is in the Ensembl archive at http://may2012.archive.ensembl.org. If I change the server address, it gives me an error. Any suggestions?
Unfortunately, we don't have REST archives that old. I also checked our remapping tools, and we don't have a mapping between RGSC3.4 and Rnor_6.0.
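For anyone landing here with the original question (sequence by coordinates on the current assembly), here is a minimal sketch of a region request against the Ensembl REST sequence/region endpoint. The helper names region_url and fetch_region are made up for illustration, and the sketch assumes the httr package is installed. It also assumes the main server at https://rest.ensembl.org, which serves the current release only, so it will not return May 2017 or RGSC3.4 coordinates.

```r
# Build Ensembl REST "sequence/region" URLs from genomic coordinates.
# region_url() and fetch_region() are hypothetical helper names.
# Assumption: https://rest.ensembl.org serves the current release only,
# so this is not a substitute for an archive site.
region_url <- function(chromosome, start, end, strand = 1,
                       species = "human",
                       server = "https://rest.ensembl.org") {
  # Avoid scientific notation (e.g. 1e+08) in the coordinates
  start <- format(start, scientific = FALSE, trim = TRUE)
  end   <- format(end, scientific = FALSE, trim = TRUE)
  # sprintf() is vectorised, so vectors of coordinates give one URL per region
  sprintf("%s/sequence/region/%s/%s:%s..%s:%s",
          server, species, chromosome, start, end, strand)
}

# Fetch one region as plain text; needs the httr package and network access.
fetch_region <- function(url) {
  response <- httr::GET(url, httr::content_type("text/plain"))
  httr::stop_for_status(response)
  httr::content(response, as = "text", encoding = "UTF-8")
}
```

With the coordinates from the first post, region_url("X", 100639991, 100644991) gives the URL for that region, and fetch_region() on that URL should return the sequence as a single string. For a large list of coordinates, something like vapply(urls, fetch_region, character(1)) would work, but keep an eye on the server's rate limits.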