Please use the formatting bar (especially the code option) to present your post better. You can use backticks for inline code (`text` becomes text), or select a chunk of text and use the highlighted button to format it as a code block. I've done it for you this time.
What you're asking for is called left alignment and normalization. It represents each variant in its most parsimonious form, shifted as far left as the reference allows, and it's a best practice that I use all the time.
If you have the VCF file this data comes from and the reference sequence used in the analysis, you can use either `bcftools norm` (bcftools) or `vt decompose | vt norm` (vt) to get to where you need from the VCF file. I'd recommend the latter as it makes tracking changes easier by adding `OLD_MULTIALLELIC` and `OLD_VARIANT` INFO fields.
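If you want to drive either tool from a script, here is a minimal Python sketch that only builds the two command lines. The function names and the file names in the usage note are placeholders of mine, not anything from your data, and I've spelled out the full vt subcommand name `vt normalize`. The flags shown (`-f`, `-m -any`, `-o` for bcftools; `-s`, `-r`, `-o` for vt) are the ones I'd normally reach for, but do check them against the versions you have installed.

```python
import shlex


def bcftools_norm_cmd(vcf, ref_fasta, out_vcf):
    """Build an argv list for bcftools norm (placeholder helper, not part of bcftools)."""
    # -f: reference FASTA used for left-alignment
    # -m -any: split multiallelic records into biallelic ones
    # -o: output file
    return ["bcftools", "norm", "-f", ref_fasta, "-m", "-any", "-o", out_vcf, vcf]


def vt_pipeline_cmd(vcf, ref_fasta, out_vcf):
    """Build a shell pipeline string for vt decompose | vt normalize (placeholder helper)."""
    # vt decompose -s: split multiallelic records, writes to stdout by default
    # vt normalize -r: reference FASTA; the trailing "-" reads the VCF from stdin
    return (
        f"vt decompose -s {shlex.quote(vcf)} "
        f"| vt normalize -r {shlex.quote(ref_fasta)} -o {shlex.quote(out_vcf)} -"
    )
```

You could then run them with something like `subprocess.run(bcftools_norm_cmd("in.vcf", "ref.fa", "out.vcf"), check=True)`, or pass the vt string to `subprocess.run(..., shell=True, check=True)`.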
If not, it becomes a much more challenging task because you're going to need to compare the REFERENCE sequence and ALT alleles manually to get to your solution.
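To give an idea of what the manual comparison involves, here is a rough Python sketch of the usual trim-and-extend procedure for a single biallelic variant. It is an illustration under assumptions, not what bcftools or vt actually run: the function name `normalize_variant` and the `genome` argument (the chromosome sequence as a plain string) are my own, positions are 1-based as in VCF, and the variant is assumed not to need padding at the very first base. Note that it still needs the flanking reference bases; without them you can only trim shared bases, not shift the variant left.

```python
def normalize_variant(pos, ref, alt, genome):
    """Left-align and trim one biallelic variant.

    pos is 1-based; genome is the chromosome sequence as a string
    (hypothetical inputs, chosen for illustration).
    Returns (pos, ref, alt) in normalized form.
    """
    ref, alt, genome = ref.upper(), alt.upper(), genome.upper()
    if not ref or not alt or ref == alt:
        raise ValueError("REF and ALT must be non-empty and different")
    if genome[pos - 1:pos - 1 + len(ref)] != ref:
        raise ValueError("REF does not match the reference at this position")

    # Step 1: drop shared trailing bases; whenever an allele becomes empty,
    # pad both alleles with the preceding reference base and move left.
    changed = True
    while changed:
        changed = False
        if ref[-1] == alt[-1]:
            ref, alt = ref[:-1], alt[:-1]
            changed = True
        if not ref or not alt:
            if pos == 1:
                raise ValueError("cannot extend left of the first base")
            pos -= 1
            base = genome[pos - 1]
            ref, alt = base + ref, base + alt
            changed = True

    # Step 2: drop shared leading bases while both alleles keep at least one base.
    while len(ref) >= 2 and len(alt) >= 2 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1

    return pos, ref, alt
```

For example, with the toy sequence `GGGCACACACAGGG`, a deletion written as position 8, `CAC` to `C`, should come back as position 3, `GCA` to `G`.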
Thanks. From now on, I'll use this button. :)