Hi, we found a variant in our WES data analysis and are wondering how novelty is defined. Is an SNV considered novel based on the position of the base, or based on the new base and the protein change it produces?
For example, in the gene NFKBIA, c.94A>G, p.Ser32Gly has been reported in ClinVar.
In our analysis, our patient carries a different SNV at the same position: c.94A>T, p.Ser32Cys.
Variant novelty always refers to the allele, not to the position. In your example, the c.94A>G change would be known, and c.94A>T would be novel.
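As a rough illustration of the allele-versus-position distinction, here is a minimal Python sketch. The `known_alleles` set and the `classify` function are hypothetical names made up for this example; the set stands in for whatever resource you compare against and contains only the ClinVar entry mentioned in the question.

```python
# Hypothetical stand-in for a reference resource: each entry is a full allele,
# keyed by (gene, cDNA position, reference base, alternate base).
known_alleles = {("NFKBIA", 94, "A", "G")}


def classify(gene, cdna_pos, ref, alt, known):
    """Report whether an allele is known, comparing the allele and not just the position."""
    if (gene, cdna_pos, ref, alt) in known:
        return "known allele"
    # Same gene and position, but a different base change: still a novel allele.
    if any(k[:2] == (gene, cdna_pos) for k in known):
        return "novel allele at known position"
    return "novel allele at novel position"


print(classify("NFKBIA", 94, "A", "G", known_alleles))
print(classify("NFKBIA", 94, "A", "T", known_alleles))
```

The second call is the patient's variant: the position matches a known entry, but the allele itself does not, so it is reported as novel.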
Historically, variant novelty was judged against dbSNP, because it used to contain common variants only, but this view changed once 1000g variants were included in version 138. If you measure novelty against ClinVar or any other resource of disease-associated variants, keep in mind that such a resource may contain very rare variants, so you should be cautious about how you use the resulting novelty information.
Thank you for your explanation that variant novelty refers to the allele and not the position. I have a follow-up question.
If I look at the neighbouring SNV at position chr14:35404550 (GRCh38), it has two alternate alleles (https://www.ncbi.nlm.nih.gov/snp/rs28933100). Why are both alleles, C>A and C>T, assigned a single rs number?
You have to keep in mind that rs numbers are just tags that dbSNP assigns to describe variants in the genome, an evolving convention to state that “this allele has been seen at this particular position”. Multi-allelic variants do occur, and when this happens it makes sense to merge all the alleles under the same rs number, although each one will have its own frequency and of course its own clinical relevance. This multi-allelic information was rare in the past, as genotyping techniques used to reduce variant detection to bi-allelic sites, but sequencing techniques changed the rules of the game and everything started to be described as it really was.
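To make the "one rs number, several alleles" idea concrete, here is a small Python sketch that expands a multi-allelic site into one record per alternate allele, so each can be annotated separately. The `Allele` type and the `split_multiallelic` function are hypothetical names for illustration, not part of any dbSNP tool; the coordinates and alleles are the ones quoted in the question.

```python
from typing import NamedTuple


class Allele(NamedTuple):
    rsid: str
    chrom: str
    pos: int
    ref: str
    alt: str


def split_multiallelic(rsid, chrom, pos, ref, alts):
    """Return one record per alternate allele, all sharing the same rs number."""
    return [Allele(rsid, chrom, pos, ref, alt) for alt in alts]


records = split_multiallelic("rs28933100", "chr14", 35404550, "C", ["A", "T"])
for rec in records:
    print(f"{rec.rsid} {rec.chrom}:{rec.pos} {rec.ref}>{rec.alt}")
```

Frequency and clinical significance would then be attached to each record individually, since they are properties of the allele and not of the rs number.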