#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT NA00001
1 2827693 . CCGTGGATGCGGGGACCCGCATCCCCTCTCCCTTCACAGCTGAGTGACCCACATCCCCTCTCCCCTCGCA C . PASS SVTYPE=DEL;END=2827680;BKPTID=Pindel_LCS_D1099159;HOMLEN=1;HOMSEQ=C;SVLEN=-66 GT:GQ 1/1:13.9
The reference sequence is 70 bp long (69 bp if we drop the first base, which isn't part of the deletion), but SVLEN here is -66, and END minus POS gives -13. Shouldn't SVLEN be -69 and END be 2827762? Or maybe I'm missing something here.
Yes, this record is definitely buggy.
I believe this record is supposed to represent rs2376870 (1000 Genomes ID P1_M_061510_1_909), which on hg18 should look like:
1 2827694 P1_M_061510_1_909 CGTGGATGCGGGGACCCGCATCCCCTCTCCCTTCACAGCTGAGTGACCCACATCCCCTCTCCCCTCGCA C . . BKPTID=Pindel_LCS_D1099159;END=2827762;HOMLEN=1;HOMSEQ=G;SVLEN=-68;SVTYPE=DEL
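For anyone who wants to check the arithmetic, here is a minimal sketch in Python. It assumes a precise deletion written with explicit REF and ALT alleles, where END is the last reference base covered by REF and SVLEN is the ALT length minus the REF length. The function name `expected_end_and_svlen` is made up for this illustration and is not part of any VCF library.

```python
def expected_end_and_svlen(pos: int, ref: str, alt: str) -> tuple[int, int]:
    """Return (END, SVLEN) implied by POS, REF and ALT.

    Assumes a precise indel with explicit alleles:
    END   = POS + len(REF) - 1   (last reference base spanned by REF)
    SVLEN = len(ALT) - len(REF)  (negative for deletions)
    """
    end = pos + len(ref) - 1
    svlen = len(alt) - len(ref)
    return end, svlen


# REF allele from the record in the question (70 bp).
REF_ORIG = (
    "CCGTGGATGCGGGGACCCGCATCCCCTCTCCCTTCACAGC"
    "TGAGTGACCCACATCCCCTCTCCCCTCGCA"
)
# The corrected record starts one base later, so its REF drops the first base.
REF_FIXED = REF_ORIG[1:]

# Record from the question: POS 2827693 with the 70 bp REF.
print(expected_end_and_svlen(2827693, REF_ORIG, "C"))   # (2827762, -69)

# Corrected record: POS 2827694 with the 69 bp REF.
print(expected_end_and_svlen(2827694, REF_FIXED, "C"))  # (2827762, -68)
```

Both ways of writing the record agree on END=2827762, and neither gives the END=2827680 or SVLEN=-66 from the original example.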
I amended the spec, versions 4.1 and 4.2 (draft).
I also fixed the typo on the next line where SVLEN should have been -205, not -105.
Congratulations on having the same master's thesis project as me! :-)