Hopefully this question isn't too specific. I am using the latest release of the human genome in the Ensembl database (homo_sapiens_core_76_38). I would like to map exons to their DNA sequence. The database schema seems to indicate that I can take the seq_region_id from the exon table and use it to look up the dna table. However, there isn't a DNA sequence for every exon. For example, for the exon with exon_id=28550800, its corresponding seq_region_id does not exist in the dna table. This is my first time using Ensembl, so is there something I'm missing?
Is there a reason you're not just using BioMart (that's a query for the exonic sequences of each annotated human exon from release 76)?
So your approach of using BioMart will work, but it still doesn't solve my problem of how the seq_region_id in the exon table maps to the seq_region_id in the dna table. Although the columns have the same name, their values aren't the same in the database. I just did a SQL join between the dna table and the exon table on seq_region_id, and it shows that there is no relation between the two tables.
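One possible explanation, offered as an assumption rather than something confirmed in this thread: in the Ensembl core schema the dna table appears to hold sequence only for sequence-level regions (such as contigs), while exons are placed on chromosome-level regions, with the assembly table (columns asm_seq_region_id, cmp_seq_region_id, asm_start, asm_end, cmp_start, cmp_end, ori) linking the two. The sketch below is a toy SQLite model of that idea with made-up rows, not the real database. The function names are hypothetical, and it only handles the simplest case: an exon lying entirely inside one forward-oriented component.

```python
import sqlite3


def build_toy_db():
    """Create an in-memory database that mimics a few Ensembl core columns.

    All rows are invented for illustration; only the column names are meant
    to mirror the real schema, and that mirroring is itself an assumption.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE exon (exon_id INTEGER, seq_region_id INTEGER,
                           seq_region_start INTEGER, seq_region_end INTEGER);
        CREATE TABLE dna (seq_region_id INTEGER, sequence TEXT);
        CREATE TABLE assembly (asm_seq_region_id INTEGER,
                               cmp_seq_region_id INTEGER,
                               asm_start INTEGER, asm_end INTEGER,
                               cmp_start INTEGER, cmp_end INTEGER,
                               ori INTEGER);
        -- Exon sits on "chromosome" region 100, positions 7..10.
        INSERT INTO exon VALUES (1, 100, 7, 10);
        -- Sequence is stored only for "contig" region 200.
        INSERT INTO dna VALUES (200, 'AAAACCGTAA');
        -- Chromosome 100, positions 3..12, is built from contig 200, 1..10.
        INSERT INTO assembly VALUES (100, 200, 3, 12, 1, 10, 1);
        """
    )
    return conn


def direct_join_rows(conn):
    """Join exon to dna directly on seq_region_id (the approach in the thread)."""
    query = """
        SELECT e.exon_id
        FROM exon e
        JOIN dna d ON d.seq_region_id = e.seq_region_id
    """
    return conn.execute(query).fetchall()


def exon_sequences(conn):
    """Join exon to dna through assembly and slice out the exon's bases.

    Covers only forward-oriented components (ori = 1) that fully contain
    the exon; exon strand is ignored.
    """
    query = """
        SELECT e.exon_id,
               SUBSTR(d.sequence,
                      e.seq_region_start - a.asm_start + a.cmp_start,
                      e.seq_region_end - e.seq_region_start + 1)
        FROM exon e
        JOIN assembly a
          ON a.asm_seq_region_id = e.seq_region_id
         AND e.seq_region_start >= a.asm_start
         AND e.seq_region_end <= a.asm_end
        JOIN dna d ON d.seq_region_id = a.cmp_seq_region_id
        WHERE a.ori = 1
        ORDER BY e.exon_id
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    conn = build_toy_db()
    print(direct_join_rows(conn))
    print(exon_sequences(conn))
```

In this toy setup the direct join finds nothing because the exon's seq_region_id never appears in dna, while the join through assembly does recover a sequence. A real query would also need to handle exons that span several components, reverse-oriented components (ori = -1), and reverse-strand exons, none of which this sketch attempts.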