I'm trying to create a VCF-formatted file from a pairwise alignment of two genomes. Downstream, I'm going to use this to calibrate some deep-sequencing data where I know the input sequence (and thus the variants).
I've hit an odd edge case in the VCF format. Following the specification at http://samtools.github.io/hts-specs/VCFv4.2.pdf, I don't know how to represent a "gap" at the first base of the query sequence with respect to the reference:
>Ref
ATTTGT
>Query
-TTTGT
Based on my understanding of the VCF format, I need to put the preceding and current base in the REF column and the preceding base in the ALT column. Here there is no preceding base, so should POS be 0?
Any suggestions?
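For context, the REF field description in the 4.2 spec says that indels are padded with the base before the event, unless the event occurs at position 1 on the contig, in which case the base after the event is used instead. A minimal Python sketch of that rule, as I read it, is below. The function name `deletion_records` is made up for illustration, and it assumes the reference row of the alignment has no gaps and that the padding base matches in both sequences.

```python
def deletion_records(ref, qry):
    """Return (POS, REF, ALT) tuples for each run of '-' in qry.

    Assumptions (illustrative only): ref and qry are aligned rows of
    equal length, ref contains no gaps (so column index + 1 is the
    1-based POS), and the padding base is identical in both rows.
    """
    if len(ref) != len(qry):
        raise ValueError("aligned rows must have equal length")
    records = []
    i = 0
    while i < len(qry):
        if qry[i] != "-":
            i += 1
            continue
        # Find the end of this run of gap columns (j is exclusive).
        j = i
        while j < len(qry) and qry[j] == "-":
            j += 1
        if i > 0:
            # Usual case: pad with the base before the deletion.
            # That base sits at 0-based index i - 1, i.e. POS = i.
            records.append((i, ref[i - 1:j], ref[i - 1]))
        else:
            # Deletion starts at position 1: pad with the base after it.
            if j >= len(ref):
                raise ValueError("entire sequence is deleted; no padding base")
            records.append((1, ref[:j + 1], ref[j]))
        i = j
    return records


print(deletion_records("ATTTGT", "-TTTGT"))  # [(1, 'AT', 'T')]
```

Under this reading, the example above would be written with POS 1, REF `AT`, and ALT `T`, rather than POS 0.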
Duplicate of "Interpreting Gaps at Pos 0 in Terms of VCF".
Yup, had the wrong search terms.