Help with the new nomenclature of multi-nucleotide variants
8 months ago
amy__ ▴ 190

Hi everyone,

I have two separately called variants, both annotated by Ensembl VEP. However, when looking into them, I noticed that they occur in adjacent codons and the predicted protein position is the same for both.

I am trying to figure out what the new 'combined' cDNA change or amino acid change would be predicted to be for this MNV.

These are the variants:

NM_001159279.1(ZNF716):c.443_444del(p.Cys148PhefsTer13)
NM_001159279.1(ZNF716):c.444T>A(p.Cys148Ter)

And this is what the bam looks like:

[image: IGV screenshot of the BAM]

Would it be something like ZNF716 c.443_444delinsAAC?

Thanks! Amy

bam igv multinucleotide • 1.4k views
8 months ago
vinayjrao ▴ 250

Hi,

The two variants are not necessarily on the same allele. What if one of them comes from the mother and the other from the father? If that is the case, you might not want to combine them. Try looking into techniques like ARMS PCR to examine the two alleles separately. The best way to identify variants more accurately is trio sequencing.


I think it's clear from the picture that there is only 1 variant with 2 alleles.


Hi, thanks both for your help - I wondered, if you always see variants together on the same read is it likely these are on the same allele? I can't find many examples or guides of this - or do you know of any compound heterozygous variants that are on different alleles in the same region that have an igv example? Thanks!! Amy


Some issue with nomenclature here.

A single mutational event gives rise to a new allele, either creating a variant in the population where there wasn't one before (for single-nucleotide mutations, this is generally the case), or adding a new allele at the same position (for STR mutations, this is typically the case).

Mutations occur in an individual who, necessarily, harbors additional mutations. Assume a mutation has arisen on the maternally-inherited chromosome. The new allele is therefore "in phase" with the alleles of other variants on the same chromosome, and "out of phase" with the alleles of the paternally-inherited chromosome.

If you see reads spanning multiple variants (say v1: C->G and v2: A->T), and you always see reads carrying either C,T or G,A, then the C and T are "in phase" (as are the G and A), and the G and T are "out of phase". Inferring phase from reads in this way is referred to as "physical phasing". Indeed, the closer two variants are, the less likely it is that a recombination occurs between them, and the more likely that their alleles are in phase; so it is also possible to perform "statistical phasing" using only unphased genotype data.
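
As a rough illustration of physical phasing, here is a minimal Python sketch. It assumes you have already extracted, for each read spanning both sites, the base observed at v1 and at v2. The `reads` list is made-up toy data, and `min_reads` is an arbitrary threshold chosen for the example, not a recommended value.

```python
from collections import Counter

# Toy data (hypothetical): one (base_at_v1, base_at_v2) tuple per read
# that spans both variant sites. v1 is C->G, v2 is A->T.
reads = [("C", "T")] * 6 + [("G", "A")] * 5

# Count how often each pair of alleles is seen together on one read.
haplotype_counts = Counter(reads)


def in_phase(counts, allele_v1, allele_v2, min_reads=2):
    """Return True if the two alleles co-occur on at least `min_reads` reads."""
    return counts[(allele_v1, allele_v2)] >= min_reads


print(in_phase(haplotype_counts, "C", "T"))  # alleles seen together on reads
print(in_phase(haplotype_counts, "G", "A"))  # alleles seen together on reads
print(in_phase(haplotype_counts, "G", "T"))  # never seen on the same read
```

Real data would also need handling of sequencing errors and of reads that cover only one of the two sites, which this sketch ignores.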

Compound heterozygosity is a term of art that typically refers to two out-of-phase deleterious alleles in the same gene, but it can also refer to a carrier of two deleterious alleles of the same variant. These cases would confer roughly the same effect as being homozygous for any of the involved mutations.

In this case, the mutation is a single complex event. As such it gave rise to only two alleles, namely the reference allele and c.443_447delinsAAC. This mutation by itself is a frameshift, and therefore likely to be deleterious; there may be a separate mutation elsewhere in the gene that is out-of-phase.
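
The frameshift claim can be checked with simple arithmetic: a delins shifts the reading frame when the net length change is not a multiple of 3. A minimal sketch, assuming the deleted range is read as inclusive of both endpoints and that the description uses plain coding positions (no intronic offsets or UTR positions); `delins_net_change` is a hypothetical helper name, not part of any library:

```python
import re


def delins_net_change(description):
    """Return (net_length_change, is_frameshift) for a simple c. delins."""
    match = re.fullmatch(r"c\.?(\d+)_(\d+)delins([ACGT]+)", description)
    if match is None:
        raise ValueError(f"unsupported description: {description}")
    start, end, inserted = int(match.group(1)), int(match.group(2)), match.group(3)
    deleted_length = end - start + 1  # both endpoints counted
    net_change = len(inserted) - deleted_length
    return net_change, net_change % 3 != 0


print(delins_net_change("c.443_447delinsAAC"))
```

Here 5 bases are removed and 3 are inserted, a net loss of 2 bases, so the reading frame shifts.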


Hi LChart,

Thank you that all helps! For example, with these variants that are close together but only seen on different reads, would it be sensible to say that the As and GGs are likely out of phase?

[image: IGV screenshot of nearby variants seen on different reads]

Thanks! Sorry there isn't much information on IGV about interpreting variants close by to each other!


Yes, the A allele of the G/A variant is in phase with the CA allele (and out of phase with the GG allele) of the CA/GG dinucleotide variant.


you always see reads carrying C,T or reads carrying G,A, then the C and T are "in phase", and the G and T are "in phase"

  • Is 'the G and T are "in phase"' a typo? Should it be G and A?

Edited. C,T are "in phase" and G and T are "out of phase"


Hi, sorry, the wording is confusing me a little. Suppose there are two SNVs: one set of reads has 'G' instead of 'C', and another set of reads has only 'T' instead of 'A'.

If these two SNVs, C>G and A>T, are in the same region but never seen together in the same reads, would they be out of phase with each other? And if they were both seen in the same reads, would they be in phase with each other?

Thanks! Amy


The SNVs each have two alleles, the reference and the alternate.

In your case the reference sequence is C-A. You observe only C-A reads and G-T reads. Thus the C and A alleles are in phase, and the G and T alleles are in phase.

Another way of saying "in phase" would be "on the same haplotype".


That helps! Thank you!

8 months ago
LChart 4.6k

Almost. It should be c.443_447delinsAAC. The right endpoint of the "del" should span the whole part that was deleted; otherwise there can be ambiguity in homopolymer or homodimer runs. Actually, I forget whether the interval is closed or half-open, so it could be c.443_448delinsAAC.
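
To show how the endpoints fall out, here is a minimal Python sketch that compares a reference stretch with the observed haplotype, trims the bases they share at each end, and reports the remaining difference as a delins with both endpoints counted. The sequences and the starting position 441 are invented toy values, not the real ZNF716 sequence, and `delins_from_haplotype` is a hypothetical helper. It does no normalisation beyond trimming, so a dedicated HGVS tool is the better choice for real variants.

```python
def delins_from_haplotype(ref, alt, first_pos):
    """Describe the ref -> alt difference as a c. delins string.

    `first_pos` is the coding position of ref[0]. The reported range
    includes both the first and the last deleted base.
    """
    shortest = min(len(ref), len(alt))
    # Trim bases shared at the left end.
    prefix = 0
    while prefix < shortest and ref[prefix] == alt[prefix]:
        prefix += 1
    # Trim bases shared at the right end, without overlapping the prefix.
    suffix = 0
    while suffix < shortest - prefix and ref[-1 - suffix] == alt[-1 - suffix]:
        suffix += 1
    deleted = ref[prefix:len(ref) - suffix]
    inserted = alt[prefix:len(alt) - suffix]
    if not deleted or not inserted:
        raise ValueError("difference is not a delins (pure insertion, deletion, or no change)")
    start = first_pos + prefix
    end = start + len(deleted) - 1
    span = str(start) if start == end else f"{start}_{end}"
    return f"c.{span}delins{inserted}"


# Toy example: positions 441-449 of an invented reference, and a read
# haplotype in which TGTGT (443-447) is replaced by AAC.
print(delins_from_haplotype("ATTGTGTGA", "ATAACGA", 441))
```

With these toy inputs the five reference bases at 443 to 447 are replaced by three, which is why the right endpoint is 447 rather than 444.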


Thanks a lot, I'll have a look into it!
