Getting All The Flanking Regions Around Genes In A Genome?
13.0 years ago
Dror ▴ 280

I want to retrieve the upstream/downstream regions of all the genes in several genomes. Is there a good map that tells you the exact position of each gene in a chromosome/genome FASTA file, so that the flanking regions can be extracted easily? For example, a map of all the predicted positions of Entrez genes?

BioMart/Ensembl used to be able to do this easily, but now it seems to be broken, and I can't get the flanking regions of all genes any more without running into errors.

An answer for particular organisms won't be helpful, since I am looking for a general method that works across very different genomes, such as sponges, sea anemones, mammals, and insects. UCSC and similar services will therefore not help in this case.

genome retrieval • 5.9k views
13.0 years ago
Michael 55k

Uhhargh! I just checked the new BioMart 0.8 web interface. Yes, it looks fancier with Web 2.0 widgets and such, but I think usability has not kept up, and the interface seems cluttered.

Maybe that's only because I need to adapt; let's see. Also, I agree that the flank functionality has changed somehow. I could swear (I could be wrong, of course) that it used to be possible to retrieve both upstream and downstream flanks in one go. Now you can get either upstream or downstream flanks in a single query, but not both; otherwise I get:

Validation Error: For this sequence option choose upstream OR downstream flanking sequence, NOT both.

Anyway, it still works; you just have to run two queries and concatenate the FASTA files. And if you don't want to use the new interface, you can use Ensembl BioMart, which is still on the old interface (and I hope they don't migrate...). Here is a link to the query that yields upstream flanks in Ensembl BioMart. (The functionality to retrieve a stable URL or XML query to document what one was doing hasn't made it into the new interface either, or it is hidden very well.)
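For the "two queries, then concatenate" workaround, a small script can merge the two downloads while keeping track of which side each record came from. This is only a sketch: it assumes both FASTA files use the same header for the same gene, and the function names (parse_fasta, merge_flanks) and the "|upstream" / "|downstream" header suffixes are made up for illustration, not something BioMart produces.

```python
def parse_fasta(text):
    """Parse FASTA text into {header: sequence} (header without the '>')."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


def merge_flanks(upstream_text, downstream_text):
    """Combine two FASTA downloads, tagging each record with its side.

    Assumption: the same gene has the same header in both files.
    """
    up = parse_fasta(upstream_text)
    down = parse_fasta(downstream_text)
    lines = []
    for header in sorted(set(up) | set(down)):
        if header in up:
            lines.append(f">{header}|upstream")
            lines.append(up[header])
        if header in down:
            lines.append(f">{header}|downstream")
            lines.append(down[header])
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    upstream = ">geneA\nACGT\n>geneB\nTTTT\n"
    downstream = ">geneA\nGGCC\n"
    print(merge_flanks(upstream, downstream), end="")
```

Plain concatenation (for example with cat) also works, but if the headers are identical in both files you can no longer tell the upstream record from the downstream one, which is why the sketch tags them.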

Btw, sorry for that rant about the BioMart UI; you guys are doing a great job.

Oh, I wrote to the mailing list recently about flanking sequences and the new interface. Here's what I was told: "Hi Mary, The central.biomart.org is still under development and I don't think SNP flank sequence retrieval is implemented yet."

I checked it, it worked for me in both interfaces.

Maybe I need to wait for a while and try biomart again. For now it tends to raise errors for genomic flank sequences. Great application though. Needs some improvements.

13.0 years ago
Lee Katz ★ 3.2k

I'd go into NCBI's FTP interface to download all genomes' GFF and GenBank files. You can then parse the GFF for CDS and their start/stop coordinates. For each start/stop and given a flanking size, extract the genomic sequence.
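The parse-and-extract approach described above could look roughly like the Python sketch below. It is a minimal illustration under stated assumptions, not a tested pipeline: it assumes a tab-separated GFF file with 1-based inclusive coordinates, a genome FASTA whose sequence IDs match the first GFF column, and a genome small enough to hold in memory. The function names (read_fasta, revcomp, flanks_from_gff) are hypothetical. Flanks are reported relative to the feature's strand, and they are simply truncated at the ends of the sequence.

```python
import io


def read_fasta(handle):
    """Return {sequence_id: sequence} from a FASTA file handle."""
    seqs, name, chunks = {}, None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                seqs[name] = "".join(chunks)
            # Use the first whitespace-delimited token as the sequence ID.
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs


_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def revcomp(seq):
    """Reverse-complement a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def flanks_from_gff(gff_handle, genome, flank=1000, feature_type="CDS"):
    """Yield (seqid, start, end, strand, upstream, downstream) per feature."""
    for line in gff_handle:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != feature_type:
            continue
        seqid, strand = cols[0], cols[6]
        start, end = int(cols[3]), int(cols[4])  # 1-based, inclusive
        seq = genome[seqid]
        left = seq[max(0, start - 1 - flank):start - 1]
        right = seq[end:end + flank]
        if strand == "-":
            # On the minus strand, upstream lies to the right of the feature.
            upstream, downstream = revcomp(right), revcomp(left)
        else:
            upstream, downstream = left, right
        yield seqid, start, end, strand, upstream, downstream


if __name__ == "__main__":
    genome = read_fasta(io.StringIO(">chr1 toy\nAAACCC\nGGGTTT\n"))
    gff = "chr1\ttoy\tCDS\t4\t6\t.\t+\t0\tID=cds1\n"
    for row in flanks_from_gff(io.StringIO(gff), genome, flank=2):
        print(row)
```

Note that in eukaryotic annotations a CDS is usually listed once per exon, so passing feature_type="gene" (or "mRNA") is probably closer to what the question asks for; which feature types are present depends on the annotation file.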

If you are looking for a GUI to do this, I am not sure one exists, so you may have to program it yourself.

Can you get genomic GFF files for all the organisms? I thought they were limited to just a handful of old model organisms.

I'm pretty sure they're all on the FTP site.

I'm pretty sure they're all on there. If not, you can at least download the GenBank files and convert them to GFF using BioPerl or whatever language/framework you prefer.

13.0 years ago
Bert Overduin ★ 3.7k

The BioMart web interface really is not that well-suited for these genome-wide queries, I am afraid. I would recommend using either the BioMart API or the Ensembl Perl Core API. If you use the former, you can directly use the Perl code that you can retrieve using the [Perl] button on the BioMart web interface once you have defined your query. If you use the latter, you have to write the code yourself, but it shouldn't take too much time as this is a very simple query.
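As a rough illustration of the programmatic BioMart route, here is a sketch in Python (not the Perl code that the [Perl] button generates) that builds the kind of XML query BioMart accepts and turns it into a request URL. Several things here are assumptions to check against your own mart: the martservice endpoint URL, the dataset name passed in, the attribute names gene_flank and ensembl_gene_id, and the filter names upstream_flank and downstream_flank. The function names are hypothetical. The sketch only builds the query; it does not contact any server.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Assumed endpoint; other marts expose their own martservice URL.
MARTSERVICE = "http://www.ensembl.org/biomart/martservice"


def build_flank_query(dataset, upstream=None, downstream=None):
    """Build a BioMart XML query for gene flanks as FASTA.

    Exactly one of upstream/downstream (flank length in bp) must be given,
    mirroring the "upstream OR downstream, NOT both" validation error.
    """
    if (upstream is None) == (downstream is None):
        raise ValueError("choose exactly one of upstream or downstream")
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "FASTA",
        "header": "0",
        "uniqueRows": "0",
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset",
                       {"name": dataset, "interface": "default"})
    if upstream is not None:
        flank_filter = {"name": "upstream_flank", "value": str(upstream)}
    else:
        flank_filter = {"name": "downstream_flank", "value": str(downstream)}
    ET.SubElement(ds, "Filter", flank_filter)
    # Attribute names are assumptions; check them in your mart's configuration.
    ET.SubElement(ds, "Attribute", {"name": "gene_flank"})
    ET.SubElement(ds, "Attribute", {"name": "ensembl_gene_id"})
    body = ET.tostring(query, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>' + body


def query_url(xml_text):
    """Return a martservice URL with the XML passed as the 'query' parameter."""
    return MARTSERVICE + "?" + urlencode({"query": xml_text})


if __name__ == "__main__":
    xml_text = build_flank_query("hsapiens_gene_ensembl", upstream=1000)
    print(query_url(xml_text))
```

Looping over dataset names and over upstream/downstream would give one request per genome and side, which sidesteps the web interface for genome-wide queries. For large genomes the response can be big, so streaming it to disk is probably wiser than holding it in memory.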

In the new version 0.8, there is no [Perl] button anymore :(

Hmmmm, I wonder if that is a bug or a feature ....

I guess they have not finished implementing the REST interface yet, so this option wouldn't make sense.
