Hello All,
I am working on submitting novel SNPs to dbSNP. When I use their online VCF validator tool I get the following error:
##ERR_REF_MISMATCH=Ref allele mismatch. Fix: need to match the reference genome on the FORWARD orientation
(Expect: T, Found: G)
I checked the strand information for the called variant and it shows "+". If the variant was called on the plus strand, why does the tool report a mismatch?
Can anyone help me understand this? Should I just change the allele from G>A to T>A? I am not sure that is a good idea.
Thanks in advance!
Hello,
this discrepancy is most likely due to different reference genomes. Which one did you use for alignment and variant calling? Which one does the dbSNP validator expect?
fin swimmer
I have used hg19 which is the same as used by dbSNP validator.
Could you post the corresponding line of your vcf file?
here you go:
Hello again,
this is not a valid VCF line. It looks more like a BED file. So where does it come from?
fin swimmer
this is the vcf format!
Yes, that's better :)
I'm not familiar with the dbSNP submission validator, but the docs state that one has to provide the GenBank accession number of the reference genome used. Double-check that you really used the correct one.
In hg19 there is a G at the position you show, but in hg38 there is a T, like the validator says. You can see which GenBank accession numbers are available on this site in the History part.
fin swimmer
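To make the check concrete, here is a minimal Python sketch of comparing a VCF REF allele against a reference sequence. It uses tiny made-up sequences rather than real hg19/hg38 data, and the names `check_ref`, `toy_a` and `toy_b` are hypothetical, chosen only for illustration. It also shows why a G vs. T mismatch does not look like a strand problem: the complement of G is C, not T.

```python
# Sketch: compare the REF allele of a VCF line with a reference sequence.
# The sequences below are toy data, NOT real hg19/hg38 content.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def check_ref(vcf_line, genome):
    """Classify the REF allele of one VCF data line against `genome`.

    `genome` is a hypothetical dict: chromosome name -> sequence string.
    VCF positions are 1-based, so POS maps to string index POS - 1.
    """
    chrom, pos, _id, ref = vcf_line.split("\t")[:4]
    start = int(pos) - 1
    expected = genome[chrom][start:start + len(ref)].upper()
    ref = ref.upper()
    if ref == expected:
        return "match"
    # If the reverse complement matches, the allele was probably
    # reported on the minus strand.
    if ref.translate(COMPLEMENT)[::-1] == expected:
        return "strand_flip"
    return "mismatch (expected %s, found %s)" % (expected, ref)

# Two toy "assemblies" that differ at position 5 (G vs. T).
toy_a = {"chr1": "ACGTGACC"}
toy_b = {"chr1": "ACGTTACC"}
line = "chr1\t5\t.\tG\tA\t.\t.\t."

print(check_ref(line, toy_a))  # match
print(check_ref(line, toy_b))  # mismatch (expected T, found G)
```

With a real genome you would read the base from the FASTA of the assembly you declare in the submission instead of a dict. If the REF matches one assembly but not the other, the VCF and the declared assembly disagree.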
Yes, I did check the GenBank accession for hg19. I am using the correct one (GCF_000001405.25).
Could you post the full header of your vcf?
That's the header I am using: