Getting FASTA sequences for a list of Ensembl gene IDs
8.8 years ago
zizigolu ★ 4.3k

Hi,

I retrieved the Ensembl gene IDs of the yeast ribosomal RNA genes from Ensembl BioMart:

Ensembl Gene ID
RDN25-1
RDN18-2
RDN5-4
15S_rRNA
RDN37-2
RDN5-6
RDN5-3
RDN58-1
RDN18-1
RDN37-1

I need to download a FASTA file of the sequences for these IDs, but I could not find such an option in Ensembl. Do you have any suggestions, please?

Thank you

gene ensembl sequence biomart • 5.7k views
8.8 years ago
GenoMax 147k

You can use BioMart. Follow the sequence below.

Ensembl --> BioMart --> Choose Database --> "Ensembl Genes" --> Select Yeast Genome from list --> Filters (in the left pane) --> Gene --> Input external references ID list --> Paste the IDs in --> Attributes (left pane) --> Sequence --> Select as needed --> Results button at the top of the page --> Export to "File" as "FASTA".
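
The same query can also be submitted to BioMart programmatically instead of through the web form. Below is a minimal sketch, not a definitive recipe. It assumes the public martservice endpoint at www.ensembl.org, the yeast dataset name `scerevisiae_gene_ensembl`, the filter `ensembl_gene_id`, and the sequence attribute `gene_exon_intron` (unspliced gene). All of these are assumptions and should be checked against the "XML" button on the BioMart results page. The helper names `build_query` and `fetch_fasta` are made up for illustration.

```python
import urllib.parse
import urllib.request
from xml.sax.saxutils import quoteattr

# Assumed endpoint and names; confirm them with the "XML" button in BioMart.
MART_URL = "https://www.ensembl.org/biomart/martservice"
DATASET = "scerevisiae_gene_ensembl"
SEQ_ATTRIBUTE = "gene_exon_intron"  # assumed name for the unspliced gene sequence


def build_query(gene_ids, dataset=DATASET, seq_attribute=SEQ_ATTRIBUTE):
    """Return a BioMart XML query asking for FASTA output for gene_ids."""
    id_list = ",".join(gene_ids)
    # quoteattr() escapes each value and wraps it in quotes.
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        "<!DOCTYPE Query>"
        '<Query virtualSchemaName="default" formatter="FASTA" header="0" '
        'uniqueRows="0" datasetConfigVersion="0.6">'
        f'<Dataset name={quoteattr(dataset)} interface="default">'
        f'<Filter name="ensembl_gene_id" value={quoteattr(id_list)}/>'
        f"<Attribute name={quoteattr(seq_attribute)}/>"
        '<Attribute name="ensembl_gene_id"/>'
        "</Dataset></Query>"
    )


def fetch_fasta(gene_ids, url=MART_URL):
    """POST the query to BioMart and return the response text (needs network)."""
    data = urllib.parse.urlencode({"query": build_query(gene_ids)}).encode()
    with urllib.request.urlopen(url, data=data, timeout=60) as response:
        return response.read().decode()
```

Calling `fetch_fasta(["RDN25-1", "RDN18-2", "RDN5-4"])` and writing the returned text to a file should give the same kind of FASTA export as the web interface, provided the assumed dataset and attribute names are right.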


Thank you so much.

Sorry, I also searched for Arabidopsis rRNA in the plant BioMart. I found the IDs, but the sequences were unavailable. I then tried the IDs in NCBI Nucleotide, but there was nothing there either.


Are your IDs from yeast or Arabidopsis? The ones in your original post are from yeast.


For Arabidopsis rRNA use the following path:

Ensembl Plant --> BioMart --> Choose Database --> Plant Mart --> Select Arabidopsis Genome from list --> Filters (in the left pane) --> Gene --> Gene type (4th option) --> Select "rRNA" --> Attributes (left pane) --> Sequence --> Select features as needed (Unspliced gene may be one option) --> Results Button at top of the page --> Export to "File" as "FASTA".


Thank you so much, genomax2.

My IDs were from yeast, and you resolved that problem. I then asked about Arabidopsis, and your tip worked well for Arabidopsis too.


I'm sorry, I'm very confused. The IDs in the list are from baker's yeast. You can easily query and get the sequence, for example. What does Arabidopsis have to do with it?

2.5 years ago

This is one line of code with gget seq:

pip install gget, then simply:

# Command-line
gget seq -o yeast.fasta RDN25-1 RDN18-2 RDN5-4
# Python
import gget
gget.seq(["RDN25-1", "RDN18-2", "RDN5-4"], save=True)
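
For a longer list, the IDs can be read from a text file rather than typed out. This is a minimal sketch, assuming a hypothetical file with one ID per line that may start with the BioMart column header `Ensembl Gene ID`. The names `read_ids` and `fetch_sequences` are illustrative, and the only gget call used is the `gget.seq(..., save=True)` shown above.

```python
def read_ids(path, header="Ensembl Gene ID"):
    """Return the unique IDs from a one-ID-per-line file, in file order."""
    ids = []
    with open(path) as handle:
        for line in handle:
            gene_id = line.strip()
            # Skip blank lines, the BioMart column header, and duplicates.
            if gene_id and gene_id != header and gene_id not in ids:
                ids.append(gene_id)
    return ids


def fetch_sequences(path):
    """Pass the IDs in `path` to gget (needs gget installed and network)."""
    import gget  # imported here so read_ids works without gget installed

    return gget.seq(read_ids(path), save=True)
```

For example, `fetch_sequences("ids.txt")` would pass every ID in the hypothetical `ids.txt` to gget in a single call.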
8.8 years ago
Jenez ▴ 540

I could not find a decent way of automating the process through E-utilities (I'm sure plenty of people here can show you how), but if you don't mind looking up these few manually, you can find the sequences quite easily through either the

NCBI Gene database:

http://www.ncbi.nlm.nih.gov/gene/9164935

or SGD's database:

http://www.yeastgenome.org/locus/S000006484/sequence


Thank you, Jenez. You are right, but doing it manually is error prone and rather time consuming.


Funny, I would argue that automating the workflow is more error prone, seeing how data is rarely standardized across the whole data set you are looking at.


No, Jenez, I mean that the ways you suggested (the NCBI Gene database or SGD's database) are time consuming. As for automating, I don't have the programming skills to do so.

