I am trying to index a VCF file using igvtools, but I get the following error:
Error: htsjdk.tribble.TribbleException: The provided VCF file is malformed at approximately line number 5880: Duplicate allele added to VariantContext: GT
When I go to that line, it looks like the reference allele (REF) is repeated in the alternate allele (ALT) column. Here is the line:
1 19723050 rs9004957 GT G,GT . . RSPOS=19617712;RV;dbSNPBuildID=118;SAO=0;VC=in-del;VLD;VP=050000000005000100000200
When I edit the VCF and remove the extra GT from the ALT column, I get the same error again, just thousands of lines later in the file. If this happened only a couple of times I would fix the lines by hand, but there are too many occurrences for that. Is there a way to fix all of them at once?
That worked like a charm. I changed it slightly so that it writes to a new file. Here is what I did, for anyone else who encounters this error. The only change is that I added
old.vcf > new.vcf
to the code.
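
For anyone who lands here without the command referred to above, here is a minimal Python sketch of the same kind of fix. It is not the code from the answer, just one way to do it: it removes any ALT allele that is identical to REF and writes the result to a new file. It assumes a tab-delimited, uncompressed VCF with no genotype (sample) columns, like the dbSNP record above. If the file has sample columns or per-allele INFO values, dropping an ALT allele would shift the allele indices and those fields would need adjusting as well. The function name and the script name fix_vcf.py are made up for illustration.

```python
import sys


def drop_ref_from_alt(line: str) -> str:
    """Return a VCF line with any ALT allele identical to REF removed."""
    # Header and meta lines are passed through untouched.
    if line.startswith("#"):
        return line
    fields = line.rstrip("\n").split("\t")
    # Leave anything that does not look like a data record alone.
    if len(fields) < 5:
        return line
    ref = fields[3]
    alts = fields[4].split(",")
    kept = [allele for allele in alts if allele != ref]
    # Nothing to fix: return the line exactly as it came in.
    if len(kept) == len(alts):
        return line
    # "." is the VCF placeholder for a missing ALT value.
    fields[4] = ",".join(kept) if kept else "."
    return "\t".join(fields) + "\n"


def fix_file(in_path: str, out_path: str) -> None:
    """Copy in_path to out_path, fixing each record on the way."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            dst.write(drop_ref_from_alt(line))


if __name__ == "__main__" and len(sys.argv) == 3:
    fix_file(sys.argv[1], sys.argv[2])
```

Run it as: python fix_vcf.py old.vcf new.vcf, then index new.vcf with igvtools. The original file is left unchanged, so the two can be compared afterwards.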