Question

Why do the same genetic variants sometimes result in different amino acids in the REVEL table?

0

15 months ago

ManuelDB ▴ 110

Is this an error?

[image: rows from the REVEL table]

Columns:

[image: the table's column names]

Variant 9:131048226G>C produces the amino acid change Q>H. I have confirmed that this is correct. So why is exactly the same variant, in rows two and three of the first image, given two different amino acid changes?

Do you know where the Gln>His and Val>Leu changes in REVEL may have come from? Could this be an error in the REVEL spreadsheets?

I am asking here first (in case I am not understanding something very basic) before contacting REVEL to show this.

This is happening in the latest version (May 3, 2021). I have not checked previous versions.

Additionally, why does the transcript ID change when working in the same position? It should be the same, shouldn't it?

This occurs 4332937 times in the table, and it is related to the transcript IDs, but I don't know how or why.

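# counting_mismatches.sh
# Columns (comma-separated): 1 chr, 2 hg19 pos, 3 GRCh38 pos, 4 ref, 5 alt,
# 6 ref amino acid, 7 alt amino acid, 8 REVEL score, 9 transcript IDs
awk -F ',' '{
    key = $1 "," $2 "," $3 "," $4      # same position and reference base
    if (key in arr) {
        # Report rows whose reference amino acid differs from the first one seen
        if ($6 != arr[key]) {
            print "Mismatch found at", key, ":", $6, "!=", arr[key], $9
        }
    } else {
        arr[key] = $6
    }
}' revel_with_transcript_ids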

bash counting_mismatches.sh | wc -l
4332937

Here is an example of this mismatch:

    1,1564837,1629457,C,A,N,K,0.212,ENST00000520777;ENST00000357210;ENST00000360522;ENST00000378710;ENST00000355826;ENST00000518681;ENST00000505820;ENST00000504599;ENST00000378708
    1,1564837,1629457,C,A,P,T,0.086,ENST00000514234
    1,1564837,1629457,C,G,N,K,0.212,ENST00000520777;ENST00000357210;ENST00000360522;ENST00000378710;ENST00000355826;ENST00000518681;ENST00000505820;ENST00000504599;ENST00000378708
    1,1564837,1629457,C,G,P,A,0.026,ENST00000514234
    1,1564837,1629457,C,T,P,S,0.308,ENST00000514234

When the transcript ID is ENST00000514234, the reference amino acid is P, and when the transcript IDs are ENST00000520777;ENST00000357210;ENST00000360522;ENST00000378710;ENST00000355826;ENST00000518681;ENST00000505820;ENST00000504599;ENST00000378708, the reference amino acid is N.
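The script above counts rows rather than variants. As a rough cross-check, here is a sketch that collapses the table to distinct variants (chr, both positions, ref, alt) and lists those annotated with more than one amino acid change. The function name `multi_consequence_variants` is made up for illustration, and the column layout is assumed from the example rows.

```shell
# Hypothetical helper: list variants that carry more than one amino acid change.
# Assumed columns: 1 chr, 2 hg19 pos, 3 GRCh38 pos, 4 ref, 5 alt,
#                  6 ref amino acid, 7 alt amino acid, 8 REVEL, 9 transcript IDs
multi_consequence_variants() {   # usage: multi_consequence_variants FILE
    awk -F ',' '
    {
        variant = $1 "," $2 "," $3 "," $4 "," $5
        change  = $6 ">" $7
        # Record each distinct (variant, change) pair only once
        if (!((variant, change) in seen)) {
            seen[variant, change] = 1
            if (n[variant]++) {
                changes[variant] = changes[variant] ";" change
            } else {
                changes[variant] = change
            }
        }
    }
    END {
        # Output order is unspecified; pipe through sort if order matters
        for (v in n) {
            if (n[v] > 1) print v "\t" changes[v]
        }
    }' "$1"
}
```

Running `multi_consequence_variants revel_with_transcript_ids | wc -l` would then give the number of affected variants instead of the number of mismatching rows.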

REVEL • 1.0k views

ADD COMMENT • link updated 15 months ago by LauferVA 4.5k • written 15 months ago by ManuelDB ▴ 110

1

15 months ago

LauferVA 4.5k

Exactly as LChart said. The only thing I want to add is: look at your far-right column. ENST is the prefix of an Ensembl identifier for a human transcript (see http://useast.ensembl.org/Help/View?id=151).

I think the easiest way to visualize this is to go to the UCSC Genome Browser (https://genome.ucsc.edu/cgi-bin/hgTracks?db=hg38&lastVirtModeType=default&lastVirtModeExtraState=&virtModeType=default&virtMode=0&nonVirtPosition=&position=chr9%3A128276276%2D128288989&hgsid=1682571848_ABPHL3Q6hbxE4942mzYcYpPepKmK), enter the nucleotide position, zoom out to, say, 50-100 bases, then load the relevant transcript tracks (GENCODE, RefSeq) and look at the changes. With approaches like that, you can always answer the question "is this an error in the database?".

Finally, when variants are named according to a nomenclature like HGVS, a coding- or protein-level name is only meaningful together with the transcript ID it refers to. In other words, chr13 94393931 A>T says nothing about the protein consequence until you state a transcript isoform. Why? Well, what if that variant is in an exon that is not used in the brain, but you are studying the brain? Then the transcript ID or IDs that you are interested in won't be affected by that variant at all, no matter how serious it may be for another transcript of the same gene.

ADD COMMENT • link 15 months ago by LauferVA 4.5k

score 2 · Accepted Answer · 2023-08-20

2

15 months ago

amy__ ▴ 190

The transcripts are different, so the amino acid sequence will sometimes differ too. For example, some transcripts are longer than others, so the position may not be the same in each. Using the MANE transcript will help a little if you just want to find the likely change, but these are just my initial thoughts and may not be correct!
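If you settle on one transcript of interest (for example the MANE Select transcript, which you would have to look up separately), something like the following sketch could pull out just the rows annotated on it. The function name `rows_for_transcript` is hypothetical, and it assumes column 9 holds a semicolon-separated list of Ensembl transcript IDs, as in the example rows from the question.

```shell
# Hypothetical helper: keep only rows whose transcript list (column 9,
# semicolon-separated) contains exactly the given Ensembl transcript ID.
rows_for_transcript() {   # usage: rows_for_transcript FILE ENST_ID
    awk -F ',' -v id="$2" '{
        n = split($9, ids, ";")
        for (i = 1; i <= n; i++) {
            # Exact match only, so a truncated ID never matches
            if (ids[i] == id) { print; break }
        }
    }' "$1"
}
```

For instance, `rows_for_transcript revel_with_transcript_ids ENST00000514234` would keep only the P rows from the example in the question.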

ADD COMMENT • link 15 months ago by amy__ ▴ 190

2

To make this perhaps slightly clearer: there may be multiple transcripts (and indeed even multiple genes) at the same position. Where the transcripts have a different reading frame (or even strand!), a single nucleotide change will affect them in different ways. This is reflected in your table by multiple entries for the same variant, each with different transcript IDs.
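A toy sketch of the reading-frame effect, using the standard genetic code. The five-base context `AACCC` is invented so that it mirrors the amino acids in the question's example rows; it is not taken from the genome or from REVEL, and the function name `aa_changes` is made up.

```shell
# Toy illustration: one substitution read in two overlapping reading frames.
# The context "AACCC" is an invented sequence, not real genomic data.
aa_changes() {   # usage: aa_changes ALT_BASE
    awk -v alt="$1" '
    function aa(codon,    b, i, idx) {
        # Standard genetic code, codons ordered T, C, A, G at each position
        b = "TCAG"; idx = 0
        for (i = 1; i <= 3; i++) {
            idx = idx * 4 + index(b, substr(codon, i, 1)) - 1
        }
        return substr("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG", idx + 1, 1)
    }
    BEGIN {
        ref = "AACCC"                               # variant base is position 3 (C)
        mut = substr(ref, 1, 2) alt substr(ref, 4)  # substitute position 3
        # Frame A: the variant is the 3rd base of the codon at positions 1-3
        print "frameA", aa(substr(ref, 1, 3)) ">" aa(substr(mut, 1, 3))
        # Frame B: the variant is the 1st base of the codon at positions 3-5
        print "frameB", aa(substr(ref, 3, 3)) ">" aa(substr(mut, 3, 3))
    }'
}
```

With this toy sequence, `aa_changes A` prints `frameA N>K` and `frameB P>T`, and `aa_changes T` prints `frameA N>N` and `frameB P>S`. A synonymous change like N>N is not a missense variant, which would be consistent with the C>T row in the question appearing only for the P transcript.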

ADD REPLY • link 15 months ago by LChart 4.5k