20 months ago
Maxine
I have a VCF of structural variants called with Sniffles2. The ALT column sometimes contains symbolic alleles, such as <INS>, instead of nucleotide sequences. The VCF looks like this:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT cy201704 cy201804 cy201904 cy202304
NC_058089.1 79333062 Sniffles2.INS.DF23M9 T <INS> 37 PASS PRECISE;SVTYPE=INS;SVLEN=7772;END=79333062;SUPPORT=8;COVERAGE=24,25,26,28,37;STRAND=+;AC=2;STDEV_LEN=0;STDEV_POS=0;SUPP_VEC=001000000000 GT:GQ:DR:DV:ID 0/0:0:22:0:NULL 0/0:0:31:0:NULL 1/1:60:0:40:Sniffles2.INS.13174S9 0/0:0:21:0:NULL
However, I need the actual nucleotide sequences for downstream analysis. Has anybody run into a similar problem before?
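To show which records are affected, here is a minimal sketch of how the symbolic entries can be picked out. It assumes an uncompressed, plain-text VCF read line by line. The helper names `is_symbolic` and `symbolic_records` are my own and not part of Sniffles2 or any library. It only locates the records and does not recover the missing sequences.

```python
def is_symbolic(alt: str) -> bool:
    """Return True for a symbolic ALT allele such as <INS> (an ID in angle brackets)."""
    return alt.startswith("<") and alt.endswith(">")


def symbolic_records(lines):
    """Yield (chrom, pos, id, alt, svlen) for records whose ALT has a symbolic allele."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        # VCF is tab-delimited; split() also copes with a record pasted with spaces.
        fields = line.split()
        chrom, pos, var_id, alt, info = (
            fields[0], int(fields[1]), fields[2], fields[4], fields[7]
        )
        if any(is_symbolic(a) for a in alt.split(",")):
            # Flag-style INFO keys (e.g. PRECISE) carry no "=" and are ignored here.
            info_map = dict(kv.split("=", 1) for kv in info.split(";") if "=" in kv)
            yield chrom, pos, var_id, alt, info_map.get("SVLEN")


# The record from the post above, used as a small example input.
example = (
    "NC_058089.1 79333062 Sniffles2.INS.DF23M9 T <INS> 37 PASS "
    "PRECISE;SVTYPE=INS;SVLEN=7772;END=79333062;SUPPORT=8;"
    "COVERAGE=24,25,26,28,37;STRAND=+;AC=2;STDEV_LEN=0;STDEV_POS=0;"
    "SUPP_VEC=001000000000 GT:GQ:DR:DV:ID 0/0:0:22:0:NULL 0/0:0:31:0:NULL "
    "1/1:60:0:40:Sniffles2.INS.13174S9 0/0:0:21:0:NULL"
)

for record in symbolic_records([example]):
    print(record)
```

On a real file, the same generator can be fed an open file handle, e.g. `symbolic_records(open("calls.vcf"))`, where `calls.vcf` is a placeholder file name.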