Question

Retrieving the corresponding Haplotype CDS from Ensembl

0

3.5 years ago

Joseph Hughes ★ 3.0k

Given an Ensembl protein identifier and amino acid substitutions such as ENSP00000242351:701Q>E,851T>I, how do I programmatically retrieve and download the corresponding coding sequence (CDS) with the largest observed count?

Screenshot of the corresponding Haplotype CDS

I need to do this for a batch of different protein/haplotype combinations, so I would like to use the REST API.

gene Ensembl protein Haplotype • 966 views

ADD COMMENT • link updated 3.5 years ago by Emily 24k • written 3.5 years ago by Joseph Hughes ★ 3.0k

score 1 · Answer 1 · 2021-06-23

1

3.5 years ago

Emily 24k

This REST API endpoint gets the haplotypes per transcript. Each protein haplotype stores the hexes of its associated CDS haplotypes, which you can use to link it to the matching CDS haplotypes.
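For a batch job, the request could look roughly like the sketch below. It assumes the endpoint meant here is Ensembl's `transcript_haplotypes/:species/:id`, that it takes a transcript ID (ENST) rather than the protein ID, and that `sequence=1` asks for the sequences to be included; check all three against the REST documentation. The helper names and the `ENST_PLACEHOLDER` ID are made up for illustration.

```python
import json
import urllib.request

SERVER = "https://rest.ensembl.org"


def haplotype_url(transcript_id, species="homo_sapiens", with_sequence=True):
    """Build the request URL (assumed endpoint: transcript_haplotypes)."""
    url = f"{SERVER}/transcript_haplotypes/{species}/{transcript_id}"
    if with_sequence:
        # Assumption: sequence=1 adds the haplotype sequences to the output.
        url += "?sequence=1"
    return url


def fetch_haplotypes(transcript_id, species="homo_sapiens"):
    """Download and parse the JSON for one transcript (needs network access)."""
    request = urllib.request.Request(
        haplotype_url(transcript_id, species),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # ENST_PLACEHOLDER is hypothetical: substitute the transcript that
    # encodes your protein of interest.
    print(haplotype_url("ENST_PLACEHOLDER"))
```

Looping `fetch_haplotypes` over a list of transcript IDs would cover the batch case; keep the request rate modest, since the public server is rate limited.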

ADD COMMENT • link 3.5 years ago by Emily 24k

0

So would this be taking the hex for the ENSP00000242351:701Q>E,851T>I in 'protein_haplotypes' and finding it as 'other_hex' in the 'cds_haplotypes'?

ADD REPLY • link 3.5 years ago by Joseph Hughes ★ 3.0k

1

Yes, that's it. Or you can go the other way: get the other_hex from the protein_haplotype and find the cds_haplotype whose main hex it is.
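A minimal sketch of that linking step, working on the already-parsed JSON. The field names are assumptions based on this thread: `protein_haplotypes` and `cds_haplotypes` lists whose entries carry `name`, `hex`, `count`, and the linked hexes under either `other_hexes` or `other_hex`. The function names are hypothetical. It checks the link in both directions and keeps the CDS haplotype with the largest count.

```python
def linked_hexes(haplotype):
    """Return the hexes this haplotype links to in the other table."""
    # Assumption: the links live under 'other_hexes' or 'other_hex' and may
    # be a single string, a list, or a dict keyed by hex.
    other = haplotype.get("other_hexes") or haplotype.get("other_hex") or []
    if isinstance(other, str):
        return {other}
    return set(other)  # iterating a dict yields its keys


def best_cds_for_protein(data, protein_name):
    """Pick the linked CDS haplotype with the largest observed count."""
    protein = next(
        (p for p in data["protein_haplotypes"] if p["name"] == protein_name),
        None,
    )
    if protein is None:
        return None
    wanted = linked_hexes(protein)
    candidates = [
        cds
        for cds in data["cds_haplotypes"]
        # Match in either direction, as described in the replies above.
        if cds["hex"] in wanted or protein["hex"] in linked_hexes(cds)
    ]
    return max(candidates, key=lambda cds: cds["count"], default=None)
```

If the response was requested with sequences included, the chosen entry should also carry the CDS sequence itself, which can then be written out as FASTA.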

ADD REPLY • link 3.5 years ago by Emily 24k