I am benchmarking a WES VCF. The VCF file being benchmarked was generated using the hg19 reference sequence. The high-confidence reference VCF file (version 4.2.1, based on the hs37d5 reference sequence) was downloaded from the GIAB website.
When I use hap.py to benchmark my hg19 VCF file against the GIAB VCF file, hap.py reports an error from its VCF integrity checks. I have determined that the hg19 reference sequence uses the naming convention chr1, chr2...chrX, whereas the hs37d5 reference used by the GIAB VCF names them 1, 2, 3...X, Y, so the contig names in the two files do not match.
Should I use CrossMap or bcftools to convert the hs37d5 VCF to an hg19 VCF, or is there another recommended approach?
Thanks a lot for the information! Here is an update for anyone who may need it.
First, prepare a text file that serves as a conversion table between hs37d5 and hg19 contig names.
The pattern looks like this:
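A minimal sketch of such a table, assuming the format that `bcftools annotate --rename-chrs` expects (old name, whitespace, new name, one pair per line). The file name `chr_map.txt` is hypothetical, and MT is left out as an assumption on my part, because chrM in UCSC hg19 is a different sequence from MT in hs37d5 and a plain rename would not be safe there.

```shell
# Build a two-column rename table: hs37d5 name, then hg19 name.
# chr_map.txt is a placeholder file name, not one from the original post.
# MT is skipped on purpose (UCSC hg19 chrM is not the same sequence as hs37d5 MT).
for c in $(seq 1 22) X Y; do
  printf '%s\tchr%s\n' "$c" "$c"
done > chr_map.txt

# Inspect the first few pairs.
head -n 3 chr_map.txt
```

The first line of the resulting file is `1` and `chr1` separated by a tab, and the last line is `Y` and `chrY`.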
Second, use bcftools with the conversion table text file to rename the contigs in the GIAB VCF.
Write the output as vcf.gz.
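The renaming step could be sketched roughly as follows. This is not necessarily the exact command I ran; it assumes `bcftools` is on the PATH, and the function name `rename_chroms` and all file names are made up for illustration.

```shell
# rename_chroms MAP IN_VCF OUT_VCF
# Renames contigs in IN_VCF according to MAP and writes a bgzipped,
# tabix-indexed OUT_VCF. Only names change; coordinates are not lifted over.
rename_chroms() {
  map="$1"; in_vcf="$2"; out_vcf="$3"
  bcftools annotate --rename-chrs "$map" -Oz -o "$out_vcf" "$in_vcf" &&
  bcftools index -t "$out_vcf"
}

# Example call (placeholder file names):
# rename_chroms chr_map.txt giab_hs37d5.vcf.gz giab_chr_names.vcf.gz
```

If a GIAB high-confidence BED file is also passed to hap.py, it would most likely need the same chr prefix added to its first column so that it matches the renamed VCF.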