Hi,
I got a vcf file, and the first thing I checked was whether the first nucleotide given in REF matches the nucleotide in the reference at position POS. This was true for lines with SNPs, indels, and breakends, but not for lines having "<CNV>" as ALT. In those <CNV> lines the REF nucleotide always matches position (POS+1) in the reference.
In the VCF specification it is written that
If any of the ALT alleles is a symbolic allele (an angle-bracketed ID String “<ID>”) then the padding base is required and POS denotes the coordinate of the base preceding the polymorphism.
Does this mean that if there is a symbolic allele like <CNV>, the nucleotide given in REF is in fact at position (POS+1) and the vcf I received is valid?
Or should the nucleotide given at REF always match the nucleotide at position POS in the reference (meaning the vcf I received contains some invalid lines)?
Thanks
Do you have an END attribute in the INFO column when there is a <CNV>?
Yes. Here is an example line:
Nucleotide at position 11174372 on chr1 is C. Nucleotide at position 11174373 on chr1 is A.
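For anyone wanting to reproduce the check described in the question, here is a minimal sketch in Python. It assumes the reference sequence for one chromosome is already loaded as a plain string and that POS is 1-based, as in VCF. The function name `ref_offset` and the toy sequence are made up for illustration and do not come from the original file.

```python
def ref_offset(reference: str, pos: int, ref_allele: str) -> str:
    """Report whether the first REF base matches the reference at POS or POS+1.

    `reference` is the chromosome sequence as a string; `pos` is 1-based.
    """
    first = ref_allele[0].upper()

    def base_at(p: int):
        # Convert the 1-based coordinate to a 0-based string index.
        if 1 <= p <= len(reference):
            return reference[p - 1].upper()
        return None

    if base_at(pos) == first:
        return "POS"
    if base_at(pos + 1) == first:
        return "POS+1"
    return "neither"


# Hypothetical toy reference: position 2 is C, position 3 is A.
toy_reference = "GCATT"
print(ref_offset(toy_reference, 2, "C"))  # REF agrees with the base at POS
print(ref_offset(toy_reference, 2, "A"))  # REF agrees with the base at POS+1
```

Running this over every record and tabulating the result by ALT type would show whether the off-by-one pattern is limited to the <CNV> lines.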