I can vaguely guess its meaning but want to know, at least, how it works in different situations. Besides, after left alignment, can we write an indel in VCF file in a unique way?
Thank you!
To give an actual answer for people arriving here from Google: the short answer is no, left alignment does not resolve the more complicated issues with indel representation in VCF files. It helps somewhat, but there are plenty of cases where simple normalization tools do not produce a unique variant representation. (As a side note, left-aligning simple indels is an unstated convention in VCF; in contrast, HGVS guidelines state that indels should be right-aligned, i.e. shifted to the most 3' position, relative to the transcript.)
Since there is no easy answer to representing variants uniquely, the state of the art is to perform variant comparisons using haplotype-aware tools such as rtg vcfeval, hap.py, or vgraph.
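To illustrate why normalization alone is not enough, here is a minimal sketch of the idea behind haplotype-level comparison. It is not how rtg vcfeval, hap.py, or vgraph work internally; the function `apply_variants` and the toy reference are hypothetical and only show that two differently written, already normalized variant sets can produce the same sequence.

```python
def apply_variants(reference, variants):
    """Return the haplotype obtained by applying non-overlapping variants.

    reference: reference sequence as a string (toy input, an assumption)
    variants: list of (pos, ref, alt) tuples with 1-based VCF-style POS
    """
    seq = reference
    # Apply from right to left so earlier coordinates stay valid.
    for pos, ref, alt in sorted(variants, reverse=True):
        start = pos - 1
        if seq[start:start + len(ref)] != ref:
            raise ValueError(f"REF allele mismatch at position {pos}")
        seq = seq[:start] + alt + seq[start + len(ref):]
    return seq


reference = "ACGT"
as_mnp = [(2, "CG", "TA")]                  # one multi-nucleotide record
as_snvs = [(2, "C", "T"), (3, "G", "A")]    # two single-nucleotide records

# Both call sets are already left-aligned and trimmed, yet they differ as
# records while describing the same haplotype.
print(apply_variants(reference, as_mnp))    # ATAT
print(apply_variants(reference, as_snvs))   # ATAT
```

Comparing the resulting sequences rather than the records themselves is what makes the two call sets match.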
Did you see indel left/right alignment and http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_gatk_walkers_variantutils_LeftAlignVariants.html?
That might answer the second part of your question.
Thanks, Mike!
After reading your links, I learned how left alignment works for a deletion in a repeat region. However, I guess there are more situations. For example, we could have an insertion in a repeat region: ATGATG(GCG)GCGGCGTAGTAG, where (GCG) is an insertion. How would this be handled by left alignment?
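For the insertion example, a minimal sketch of the usual trim-and-extend normalization may help. It is similar in spirit to what tools such as vt normalize or bcftools norm do, but it is not their actual code. The function name `left_align` is hypothetical, the reference is taken to be the sequence above without the inserted GCG, and only a single biallelic variant that is not at the very start of the sequence is handled.

```python
def left_align(ref_seq, pos, ref, alt):
    """Left-align and trim one biallelic variant.

    ref_seq: reference sequence as a string (toy input, an assumption)
    pos: 1-based VCF POS; ref, alt: VCF REF and ALT alleles
    """
    changed = True
    while changed:
        changed = False
        # Drop a shared trailing base from both alleles.
        if ref and alt and ref[-1] == alt[-1]:
            ref, alt = ref[:-1], alt[:-1]
            changed = True
        # If an allele became empty, extend both one base to the left.
        if not ref or not alt:
            if pos == 1:
                raise ValueError("cannot extend left of the sequence start")
            pos -= 1
            base = ref_seq[pos - 1]
            ref, alt = base + ref, base + alt
            changed = True
    # Trim shared leading bases, keeping at least one base per allele.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


# Reference without the inserted GCG: ATGATG GCGGCG TAGTAG
REF_SEQ = "ATGATGGCGGCGTAGTAG"

# The insertion as written in the question: GCG inserted after position 6.
print(left_align(REF_SEQ, 6, "G", "GGCG"))    # (5, 'T', 'TGGC')
# The same insertion written at the right end of the repeat.
print(left_align(REF_SEQ, 12, "G", "GGCG"))   # (5, 'T', 'TGGC')
```

So in this example the insertion shifts one base further left than written, because the G at position 6 matches the last base of the inserted GCG. Every placement of the insertion within the repeat collapses to the same record: POS 5, REF T, ALT TGGC.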