Hello, I have a list of RIKEN IDs; I would like to convert them to FASTAs. The DAVID project does not list RIKEN identifiers as accepted input. What would you suggest?
My first thought is that most (all?) RIKEN clones are in the NCBI nucleotide database. So given a clone ID such as AK080584, you could retrieve the sequence via EUtils esearch/efetch, or use a remote database sequence retrieval utility such as BioPerl's bp_fetch:
bp_fetch net::genbank:AK080584
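If you would rather take the EUtils route, here is a minimal Python sketch. It assumes the accessions live in the nuccore database; the helper names efetch_fasta_url and fetch_fasta are just made up for illustration. Only the standard library is used.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI EUtils efetch endpoint
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(accessions):
    """Build an efetch URL that asks for FASTA text for the given accessions.

    Assumption: the accessions are nucleotide records (db=nuccore).
    """
    params = {
        "db": "nuccore",
        "id": ",".join(accessions),
        "rettype": "fasta",
        "retmode": "text",
    }
    return EFETCH + "?" + urlencode(params)


def fetch_fasta(accessions):
    """Download the FASTA records (needs network access)."""
    with urlopen(efetch_fasta_url(accessions)) as response:
        return response.read().decode("utf-8")


# Example usage (not run here because it hits the network):
# print(fetch_fasta(["AK080584"]))
```

For long lists, send the accessions in modest batches and pause between requests so you stay within NCBI's usage guidelines.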
EDIT
My second thought, now that we've established that you have MGI symbols, not RIKEN IDs, is that you can do this using BioMart. Choose Mus musculus genes as your dataset and, under Filters, you'll see MGI symbol as an option. Select the sequence retrieval options under Attributes. Search this site for the numerous explanations of how to use BioMart if required.
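If you want to script the BioMart step rather than click through the web interface, something like the Python sketch below might do. Treat the names as assumptions to check against the BioMart interface itself: I am assuming the Ensembl martservice endpoint, the dataset mmusculus_gene_ensembl, the filter mgi_symbol, and the sequence attribute cdna. The helper names biomart_query_xml and biomart_url are made up.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Assumed endpoint: the Ensembl BioMart "martservice" URL
MARTSERVICE = "https://www.ensembl.org/biomart/martservice"


def biomart_query_xml(symbols, seq_type="cdna"):
    """Build a BioMart XML query that requests FASTA for a list of MGI symbols.

    Assumed names: dataset "mmusculus_gene_ensembl", filter "mgi_symbol",
    sequence attribute given by seq_type (e.g. "cdna").
    """
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "FASTA",
        "header": "0",
        "uniqueRows": "1",
        "datasetConfigVersion": "0.6",
    })
    dataset = ET.SubElement(query, "Dataset", {
        "name": "mmusculus_gene_ensembl",
        "interface": "default",
    })
    # Several filter values are passed as one comma-separated string
    ET.SubElement(dataset, "Filter", {
        "name": "mgi_symbol",
        "value": ",".join(symbols),
    })
    ET.SubElement(dataset, "Attribute", {"name": seq_type})
    # A second attribute goes into the FASTA header line
    ET.SubElement(dataset, "Attribute", {"name": "ensembl_gene_id"})
    return ET.tostring(query, encoding="unicode")


def biomart_url(symbols, seq_type="cdna"):
    """Return a URL that submits the query; open it with urllib or a browser."""
    return MARTSERVICE + "?" + urlencode({"query": biomart_query_xml(symbols, seq_type)})


# Example usage (not run here because it hits the network):
# from urllib.request import urlopen
# print(urlopen(biomart_url(["1100001G20Rik"])).read().decode("utf-8"))
```

The easiest way to confirm the exact dataset, filter, and attribute names is to build the query once in the web interface and compare it with the XML it offers for export.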
Your example works fine, but unfortunately my IDs look quite different. If I use bp_fetch on one (or even all) of them, I get this error: "Sequence 1810008A18Rik in Database genbank in net::genbank:1810008A18Rik is not loadable. Skipping". So either they are not really RIKEN identifiers (in which case I apologize for the improper question), or not all RIKEN IDs are included in the NCBI nucleotide database. Of course, if the first explanation is the true one, I will surely choose your answer as the accepted one :).
Use http://www.informatics.jax.org/batch to get a list of segments (chromosome/start/end),
then use the UCSC DAS server to download each segment; see http://www.biostars.org/post/show/56/how-to-get-the-sequence-of-a-genomic-region-from-ucsc/
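A rough Python sketch of the DAS step, assuming you already have chromosome/start/end for each symbol from the MGI batch report. The assembly name (mm10 in the usage comment) is my assumption and has to match the assembly the MGI coordinates refer to; the helper names das_dna_url and das_to_fasta are made up.

```python
import xml.etree.ElementTree as ET


def das_dna_url(assembly, chrom, start, end):
    """Build a UCSC DAS 'dna' request URL for one genomic segment."""
    return ("https://genome.ucsc.edu/cgi-bin/das/%s/dna?segment=%s:%d,%d"
            % (assembly, chrom, start, end))


def das_to_fasta(das_xml, header, width=60):
    """Turn a DAS DNA response (XML text) into one FASTA record.

    Expects the usual DASDNA/SEQUENCE/DNA layout, with the bases
    as whitespace-separated text inside the DNA element.
    """
    root = ET.fromstring(das_xml)
    dna = root.find("./SEQUENCE/DNA")
    if dna is None or dna.text is None:
        raise ValueError("no DNA element found in DAS response")
    # Strip the newlines the server puts inside the sequence
    seq = "".join(dna.text.split())
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">%s\n%s\n" % (header, "\n".join(lines))


# Example usage (not run here because it hits the network):
# from urllib.request import urlopen
# url = das_dna_url("mm10", "chr1", 100000, 100100)
# xml_text = urlopen(url).read().decode("utf-8")
# print(das_to_fasta(xml_text, "chr1:100000-100100"))
```

Note that this gives you the genomic segment on the forward strand, introns included, so reverse-complement it yourself for genes on the minus strand.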
Could you please provide a list of your "RIKEN IDs"?
Yes, the IDs I have look like this: 1100001G20Rik, 2610301B20Rik, 4930579J09Rik.
OK; so you have MGI symbols, not RIKEN clone IDs.
Sorry for the mess, guys.
And I assume that by "convert to FASTAs", you mean "sequences in FASTA format."
Yes, I mean sequences which are in FASTA format.