Is this an error?
Columns:
The variant 9:131048226G>C produces the amino acid change Q>H. I have checked this and it is correct. So why is exactly the same variant, in rows two and three of the first image, given two different amino acid changes?
Do you know where the Gln>His and Val>Leu changes in REVEL may have come from? Could this be an error in the REVEL spreadsheets?
I am asking here first (in case I am not understanding something very basic) before contacting REVEL to show this.
This is happening in the latest version (May 3, 2021). I have not checked previous versions.
Additionally, why does the transcript ID change at the same position? It should be the same, shouldn't it?
This occurs 4332937 times in the table and seems to be related to the transcript IDs, but I don't know how or why.
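# counting_mismatches.sh
# Report rows whose reference amino acid ($6) differs from the first one
# seen for the same chromosome, positions and reference base ($1-$4).
awk -F ',' '{
    key = $1","$2","$3","$4;
    if (key in arr) {
        if ($6 != arr[key]) {
            # $9 holds the transcript IDs of the mismatching row
            print "Mismatch found at", key, ":", $6, "!=", arr[key], $9;
        }
    } else {
        arr[key] = $6;
    }
}' revel_with_transcript_ids
bash counting_mismatches.sh | wc -l
4332937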
Here is an example of this mismatch:
1,1564837,1629457,C,A,N,K,0.212,ENST00000520777;ENST00000357210;ENST00000360522;ENST00000378710;ENST00000355826;ENST00000518681;ENST00000505820;ENST00000504599;ENST00000378708
1,1564837,1629457,C,A,P,T,0.086,ENST00000514234
1,1564837,1629457,C,G,N,K,0.212,ENST00000520777;ENST00000357210;ENST00000360522;ENST00000378710;ENST00000355826;ENST00000518681;ENST00000505820;ENST00000504599;ENST00000378708
1,1564837,1629457,C,G,P,A,0.026,ENST00000514234
1,1564837,1629457,C,T,P,S,0.308,ENST00000514234
When the transcript ID is ENST00000514234, the amino acid is P.
When the transcript IDs are ENST00000520777;ENST00000357210;ENST00000360522;ENST00000378710;ENST00000355826;ENST00000518681;ENST00000505820;ENST00000504599;ENST00000378708, the amino acid is N.
To make this slightly clearer: there may be multiple transcripts (and indeed even multiple genes) at the same position. Where the transcripts have a different reading frame (or even a different strand!), a single nucleotide change will affect each transcript differently, so the same variant can produce a different amino acid change in each one. This is reflected in your table by multiple entries for the same variant, each with different transcript IDs.
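If that is what is going on, the mismatches should disappear once the transcript IDs are part of the comparison key. Here is a minimal sketch of that check on the five example rows from the question. It assumes the transcript IDs are in column 9 and the reference amino acid in column 6, as in the rows shown; `count_mismatches` and the `use_tx` switch are names made up for this illustration. Using the whole semicolon-separated transcript list as part of the key is a simplification, since the same transcript could in principle appear in different lists.

```shell
#!/usr/bin/env bash
# Example rows copied from the question (assumed column order:
# chr, hg19 pos, GRCh38 pos, ref, alt, ref aa, alt aa, REVEL, transcript IDs).
rows='1,1564837,1629457,C,A,N,K,0.212,ENST00000520777;ENST00000357210;ENST00000360522;ENST00000378710;ENST00000355826;ENST00000518681;ENST00000505820;ENST00000504599;ENST00000378708
1,1564837,1629457,C,A,P,T,0.086,ENST00000514234
1,1564837,1629457,C,G,N,K,0.212,ENST00000520777;ENST00000357210;ENST00000360522;ENST00000378710;ENST00000355826;ENST00000518681;ENST00000505820;ENST00000504599;ENST00000378708
1,1564837,1629457,C,G,P,A,0.026,ENST00000514234
1,1564837,1629457,C,T,P,S,0.308,ENST00000514234'

# count_mismatches USE_TX  (hypothetical helper, reads rows on stdin)
# USE_TX=0: key on position and reference base only, as in the original script.
# USE_TX=1: additionally key on the transcript ID list in column 9.
count_mismatches() {
  awk -F ',' -v use_tx="$1" '
    {
      key = $1 "," $2 "," $3 "," $4
      if (use_tx == 1) key = key "," $9
      if (key in ref_aa) {
        # Same key seen before: count it if the reference amino acid differs
        if ($6 != ref_aa[key]) n++
      } else {
        ref_aa[key] = $6
      }
    }
    END { print n + 0 }
  '
}

by_position=$(printf '%s\n' "$rows" | count_mismatches 0)
by_transcript=$(printf '%s\n' "$rows" | count_mismatches 1)
echo "mismatches keyed on position only:          $by_position"
echo "mismatches keyed on position + transcripts: $by_transcript"
```

On these five rows the first count is 3 and the second is 0, which is consistent with the different amino acids coming from different transcripts rather than from an error in the table. To try it on the full file, feed `revel_with_transcript_ids` to the function instead of the sample rows.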