Question

filter VCF by HGVS - indel normalisation

3.4 years ago

michal.inbar • 0

Hi,

I have a list of variants in HGVS notation (c. and p.) and I want to find them in VCF files. I'm looking for the best way to do so.

Since the HGVS notation can vary (e.g. NM_000059:c.1813_1814insA vs. NM_000059:c.1813dup), I thought I'd use the genomic position as a filter.

I tried using VEP (the online tool) to get the genomic position, with HGVSc as input. But there's a problem with indels in repetitive regions, which can be placed at several equivalent positions.

See https://www.ensembl.info/2018/06/22/cool-stuff-the-vep-can-do-normalisation/ :

"The standard way to report an insertion or a deletion in a VCF file is to write it in terms of the base upstream of it. HGVS works differently, they report the position of an insertion or a deletion in a repeat as the last position within the repeat. Since HGVS notation is in terms of the transcript, this means that for negative-stranded transcripts, the reported position is the same as that that would appear in VCF, but for positive-stranded transcripts, a different position is reported."
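To make the two conventions concrete, here is a minimal Python sketch on a made-up toy sequence (not the real reference around my variant). The function name `shift_insertion` and the 0-based coordinates are my own choices for illustration, not part of VEP or any library. It shifts an insertion through a repeat either to the leftmost position (the usual VCF convention) or to the rightmost one (3' on the forward strand).

```python
def shift_insertion(ref: str, pos: int, ins: str, direction: str = "left") -> tuple[int, str]:
    """Shift an insertion through a repeat without changing the resulting sequence.

    `pos` is the 0-based index in `ref` before which `ins` is inserted.
    Returns the shifted (pos, inserted_sequence).
    """
    if direction == "left":
        # While the last inserted base equals the reference base just upstream,
        # rotate the inserted sequence and move one base to the left.
        while pos > 0 and ins[-1] == ref[pos - 1]:
            ins = ins[-1] + ins[:-1]
            pos -= 1
    elif direction == "right":
        # Mirror image: compare the first inserted base with the next reference base.
        while pos < len(ref) and ins[0] == ref[pos]:
            ins = ins[1:] + ins[0]
            pos += 1
    else:
        raise ValueError("direction must be 'left' or 'right'")
    return pos, ins


def apply_insertion(ref: str, pos: int, ins: str) -> str:
    """Return the sequence obtained by inserting `ins` before index `pos`."""
    return ref[:pos] + ins + ref[pos:]


# Toy reference containing the repeat GTCA GTCA (made up for illustration).
toy_ref = "TTGTCAGTCACC"
left = shift_insertion(toy_ref, 6, "GTCA", "left")
right = shift_insertion(toy_ref, 6, "GTCA", "right")
print(left, right)
```

Both shifted placements describe the same inserted allele, which is why a single duplication can legitimately show up at different coordinates depending on the tool's convention.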

I need a way to get the position of the indels from HGVS notation, left-aligned as they would appear in a VCF file. Or I could go with a different way of filtering, but which?

Can I solve my problem using VEP? And if not, how?

Example:

HGVS to find: NM_000179:c.3984_3987dup (I also have this info: p.Leu1330ValfsTer12)

VEP VCF position: 48033776 (using NM_000179:c.3984_3987dup as input)

Position in my VCF files: 48033769 (there the HGVS appears as NM_000179:c.3987_3988insGTCA)

Thanks!

Emily_Ensembl

VCF VEP • 1.0k views

ADD COMMENT • link updated 3.4 years ago by Emily 24k • written 3.4 years ago by michal.inbar • 0

Emily_Ensembl

Perhaps you missed my question? Thanks!

ADD REPLY • link 3.4 years ago by michal.inbar • 0

score 1 · Answer 1 · 2021-07-01

3.4 years ago

Emily 24k

I'm afraid the VEP has no option to left-shift repeat expansions/retractions to VCF-style positions. Instead, you could run your VCF through the VEP to get HGVS notation with 3' shifting, then match on the HGVS rather than on the genomic position.
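As a rough sketch of that matching step, assuming the VCF was annotated by the VEP with VCF output and HGVS enabled (for example `--vcf --hgvs --refseq`, with 3' shifting left on via `--shift_hgvs 1`, which I believe is the default), the HGVSc strings can be pulled out of the `CSQ` INFO field and compared with the list. The function names below are hypothetical helpers, not part of the VEP. Transcript versions are stripped before comparing, since the VEP reports versioned IDs such as `NM_000179.2` while the list may not.

```python
import re


def csq_fields(header_line: str) -> list[str]:
    """Extract the CSQ sub-field names from the VEP '##INFO=<ID=CSQ,...>' header line."""
    match = re.search(r'Format: ([^"]+)"', header_line)
    if match is None:
        raise ValueError("no 'Format: ...' description found in CSQ header")
    return match.group(1).split("|")


def strip_version(hgvs: str) -> str:
    """Drop the transcript version: 'NM_000179.2:c.1A>G' -> 'NM_000179:c.1A>G'."""
    return re.sub(r"^([^:.]+)\.\d+:", r"\1:", hgvs)


def find_hgvs_matches(vcf_lines, wanted):
    """Return (chrom, pos, ref, alt, hgvsc) for records whose HGVSc is in `wanted`."""
    wanted = {strip_version(h) for h in wanted}
    fields = None
    hits = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if line.startswith("##INFO=<ID=CSQ"):
            fields = csq_fields(line)
            continue
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # INFO is the 8th column; keep only key=value entries.
        info = dict(kv.split("=", 1) for kv in cols[7].split(";") if "=" in kv)
        if fields is None or "CSQ" not in info:
            continue
        hgvsc_idx = fields.index("HGVSc")
        # One comma-separated CSQ entry per allele/transcript pair.
        for entry in info["CSQ"].split(","):
            hgvsc = entry.split("|")[hgvsc_idx]
            if hgvsc and strip_version(hgvsc) in wanted:
                hits.append((cols[0], int(cols[1]), cols[3], cols[4], hgvsc))
    return hits
```

Usage would be along the lines of `find_hgvs_matches(open("annotated.vcf"), ["NM_000179:c.3984_3987dup"])`, where `annotated.vcf` is a placeholder file name. Because the VEP output is 3'-shifted, an entry written as `c.3987_3988insGTCA` in the list would first need converting to the `dup` form to match.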
