There are ensembl gene ids which are mapped not on the primary assembly but on an alternative sequence. On the ensembl website (http://www.ensembl.org) it is possible to go to the associated gene mapped on the primary assembly by clicking on "View this gene on the primary assembly". Is it possible to obtain somewhere a table of these associations, at the best for each ensembl release separately?
Based on Emily_Ensembl's answer I constructed a sql statement for querying a genome wide mapping table:
SELECT g.stable_id original_gene_stable_id,sr.name original_gene_seq_region_name,g_mapped.stable_id mapped_gene_stable_id,g_mapped.name mapped_gene_seq_region
FROM gene AS g
JOIN seq_region AS sr ON sr.seq_region_id=g.seq_region_id
JOIN alt_allele AS aa ON g.gene_id=aa.gene_id
JOIN (
SELECT aa_sub.alt_allele_group_id,g_sub.stable_id,sr_sub.name
FROM alt_allele AS aa_sub
JOIN gene as g_sub ON aa_sub.gene_id=g_sub.gene_id
JOIN seq_region AS sr_sub ON sr_sub.seq_region_id=g_sub.seq_region_id
WHERE sr_sub.name REGEXP '^[0-9]+$'
) AS g_mapped ON g_mapped.alt_allele_group_id=aa.alt_allele_group_id;
The gene in the first column is mapped to the gene in the third column which has the same alt_allele_group and a integer as the seq_region_name.
Note: This statement is not valid for Ensembl releases before 73 since the table alt_allele does not contain a column alt_allele_group_id in previous releases.
There's a table of the mapping per gene, but we don't have a whole genome table in that format. The data is stored in the alt_allele MySQL table, which can be linked out to other tables in the MySQL schema to get the data you need. It is also possible get these data via the Ensembl Perl API, which allows you to get all alt_alleles for a gene.
The alt_allele_group_id was introduced to table alt_allele in release 73. What table I could use for mapping ensembl ids on different assemblies before release 73?
ADD REPLY
• link
updated 22 months ago by
Ram
44k
•
written 9.5 years ago by
hendrik
▴
40
0
Entering edit mode
We did not provide mapping back then. Generally, we say that every release is an improvement (as is obvious in this case) so you should always use the current release. If you have to use old data, the best option (although I don't think it's great) is to use gene name.
ADD REPLY
• link
updated 22 months ago by
Ram
44k
•
written 9.5 years ago by
Emily
24k
The
alt_allele_group_id
was introduced to table alt_allele in release 73. What table I could use for mapping ensembl ids on different assemblies before release 73?We did not provide mapping back then. Generally, we say that every release is an improvement (as is obvious in this case) so you should always use the current release. If you have to use old data, the best option (although I don't think it's great) is to use gene name.