ENSEMBL REST API: get homology ID
8.4 years ago
ana16 ▴ 10

Hi everyone, I am looking to get all the ortholog sequences from ENSEMBL in FASTA format, given a human gene ID. I found the ENSEMBL REST API and think it can do what I am looking for. I was able to get some output using:

wget -q --header='Content-type:text/xml' 'https://rest.ensembl.org/homology/id/ENSG00000157764?sequence=cdna;type=orthologues'  -O -

However, the output contains a lot of additional information (such as headers and descriptions) and repeats the human gene alignment several times. A small excerpt is shown below:

d":"ENSG00000157764"},"dn_ds":null,"target":{"perc_pos":22,"protein_id":"ENSP00000309597","taxon_id":9606,"cigar_line":"

I would like to simply get each ortholog in a nice FASTA file, starting with the >Homo_sapiens query.

>Homo_sapiens_geneID
ATGTTATATG
>Mus_musculus_OrthologID
ATGTTAAATG

Is there any post-processing that I could apply to this output to get what I am looking for? Or is there another program that could do something similar (for example, one that takes several ENSEMBL ortholog IDs as input and retrieves their cDNA in FASTA format)?

Thank you very much for your help, I appreciate your feedback.

Ana

ENSEMBL API alignment homology • 1.8k views
8.4 years ago

The Ensembl Perl API allows you to get just the data you need.
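If you would rather stay with the REST endpoint from the question, the JSON it returns can also be post-processed into FASTA. Below is a minimal sketch in Python, not a definitive implementation. It assumes the response has the layout `data -> homologies -> source/target`, that each member carries `id` and `species` fields, and that the sequence is stored under `seq` (unaligned) or `align_seq` (aligned, with `-` gaps). The function names `fetch_homologies` and `homologies_to_fasta` are made up for illustration. The URL is the one from the question; newer releases of the REST service may expect a species segment in the path, so check the endpoint documentation first.

```python
import json
import urllib.request


def fetch_homologies(gene_id: str) -> dict:
    """Download the homology JSON for one gene (URL pattern taken from the question)."""
    url = (
        "https://rest.ensembl.org/homology/id/"
        f"{gene_id}?sequence=cdna;type=orthologues"
    )
    request = urllib.request.Request(url, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)


def homologies_to_fasta(response: dict) -> str:
    """Turn a parsed homology response into FASTA text.

    Assumed layout: response["data"][i]["homologies"][j]["source" | "target"].
    The query gene is the "source" of every pair, so it is written once, first.
    """
    records = []
    seen = set()  # gene IDs already written, to avoid repeating the query gene
    for entry in response.get("data", []):
        for homology in entry.get("homologies", []):
            for side in ("source", "target"):
                member = homology.get(side, {})
                gene_id = member.get("id")
                # Prefer the unaligned sequence; otherwise strip alignment gaps.
                sequence = member.get("seq") or member.get("align_seq", "").replace("-", "")
                if not gene_id or not sequence or gene_id in seen:
                    continue
                seen.add(gene_id)
                species = member.get("species", "unknown_species").capitalize()
                records.append(f">{species}_{gene_id}\n{sequence}")
    return "\n".join(records) + ("\n" if records else "")


if __name__ == "__main__":
    # Needs network access; prints the FASTA for the gene from the question.
    print(homologies_to_fasta(fetch_homologies("ENSG00000157764")), end="")
```

Redirecting the script's output to a file gives one FASTA file per query gene, starting with the human record. To handle several genes, call `fetch_homologies` in a loop over the IDs.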
