I have a VCF file generated with the SV caller Sniffles2, and I have noticed that some insertions have the symbolic allele "<INS>" in the ALT column instead of the actual inserted sequence.
Initially, I believed that these insertions were represented symbolically due to their length exceeding a certain threshold. To investigate this, I conducted some statistics. However, the results did not support my initial assumption.
The lengths of symbolic insertions range from 2341 to 78565 bp, while the lengths of non-symbolic insertions range from 31 to 49858 bp, so the two ranges overlap heavily.
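For anyone who wants to repeat this comparison, here is a minimal sketch of how the length ranges could be computed with plain Python. It assumes the records carry `SVTYPE` and `SVLEN` in the INFO column (as described in the VCF specification); when `SVLEN` is missing for a sequence-resolved insertion, it falls back to `len(ALT) - len(REF)`. The function name `insertion_length_ranges` is made up for illustration and is not part of Sniffles2 or any library.

```python
def insertion_length_ranges(vcf_lines):
    """Return (min, max) insertion length for symbolic and explicit ALTs.

    Assumes tab-separated VCF records with SVTYPE (and usually SVLEN)
    in the INFO column. A group with no records maps to None.
    """
    lengths = {"symbolic": [], "explicit": []}
    for line in vcf_lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        ref, alt, info = fields[3], fields[4], fields[7]

        # Parse INFO into a dict; flag entries without "=" map to True.
        info_map = dict(
            kv.split("=", 1) if "=" in kv else (kv, True)
            for kv in info.split(";")
        )
        if info_map.get("SVTYPE") != "INS":
            continue

        kind = "symbolic" if alt.startswith("<") else "explicit"
        if "SVLEN" in info_map:
            svlen = abs(int(info_map["SVLEN"]))
        elif kind == "explicit":
            # Fallback assumption: length is the ALT/REF size difference.
            svlen = len(alt) - len(ref)
        else:
            continue  # symbolic ALT without SVLEN: length unknown
        lengths[kind].append(svlen)

    return {
        kind: (min(vals), max(vals)) if vals else None
        for kind, vals in lengths.items()
    }
```

To run it on a real file, pass an open file handle, e.g. `with open("calls.vcf") as fh: print(insertion_length_ranges(fh))`, where `calls.vcf` is a placeholder name; for a bgzipped VCF, `gzip.open(path, "rt")` from the standard library works the same way.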
What factors determine whether an insertion is represented symbolically in a VCF file? And is it possible to obtain the actual inserted sequence for symbolic insertions?
I would like to confirm what LChart just mentioned (I am one of the Sniffles2 developers).
Sounds great! Waiting for your findings~
Thank you for sharing your experience!