Why does the CADD database have multiple lines for the same mutation/substitution with different gene IDs?
2.5 years ago
4galaxy77 2.9k

I grepped the position 12_111803962_G_A out of the CADD database and got this:

12_111803962_G_A        32      Intergenic      DOWNSTREAM      ENSG00000274697 ENST00000617899
12_111803962_G_A        32      CodingTranscript        NON_SYNONYMOUS  ENSG00000111275 ENST00000261733

I'm confused as to why there are multiple lines for this specific mutation.

If I look it up on NCBI, it says the mutation is in the ALDH2 gene, as expected. This maps to ENSG00000111275, which is the second entry in my grep results above. However, the first entry maps to ENSG00000274697, which is a different gene, MIR6761, which I believe is next door to ALDH2.

This is confusing to me: the position 12_111803962 isn't in the MIR6761 gene, so why does it map there?

cadd • 583 views
2.5 years ago
tomas4482 ▴ 430

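This variant is also a downstream variant of ENSG00000274697: it lies just downstream of that gene rather than inside it, and Ensembl VEP annotates it with the DOWNSTREAM consequence. CADD reports one line per annotation, so the same variant appears once for each gene.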
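
To see which genes a variant is annotated against, the grep output can be grouped by variant ID. This is a minimal sketch, assuming whitespace-separated columns in the order shown in the question. The column names (`variant`, `score`, `anno_type`, `consequence`, `gene_id`, `feature_id`) and the helper `group_by_variant` are made up for illustration and are not CADD's official headers.

```python
from collections import defaultdict

# The two rows from the question, copied verbatim (whitespace-separated).
ROWS = """\
12_111803962_G_A        32      Intergenic      DOWNSTREAM      ENSG00000274697 ENST00000617899
12_111803962_G_A        32      CodingTranscript        NON_SYNONYMOUS  ENSG00000111275 ENST00000261733
"""

# Hypothetical column names, assumed for illustration only.
COLUMNS = ["variant", "score", "anno_type", "consequence", "gene_id", "feature_id"]


def group_by_variant(text):
    """Collect every annotation line under its variant ID."""
    grouped = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        record = dict(zip(COLUMNS, line.split()))
        grouped[record["variant"]].append(record)
    return dict(grouped)


if __name__ == "__main__":
    for variant, records in group_by_variant(ROWS).items():
        print(variant)
        for r in records:
            # One line per gene/transcript the variant is annotated against.
            print(f"  {r['gene_id']}\t{r['consequence']}\t{r['anno_type']}")
```

Grouping this way makes it clear that the two lines describe one variant with two different consequences, one for each gene.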
