Benchmarking an hg19 VCF against an hs37d5 VCF
3 months ago
ThomasLam • 0

I am benchmarking a WES VCF. The VCF being benchmarked was generated using the hg19 reference sequence. The high-confidence reference VCF (version 4.2.1, on the hs37d5 reference sequence) was downloaded from the GIAB website.

When I use hap.py to benchmark the hg19 VCF against the GIAB VCF, hap.py reports an error from its VCF integrity checks. I have determined that the hg19 reference sequence names its chromosomes chr1, chr2, ..., chrX, whereas the hs37d5 reference used by the GIAB VCF names them 1, 2, 3, ..., X, Y, so the contig names in the two files do not match.

Should I use CrossMap or bcftools to convert the hs37d5 VCF to an hg19 VCF, or is there another recommended approach?

hs37d5 benchmark hg19 GIAB • 488 views
3 months ago

Use bcftools annotate --rename-chrs. See: "VCF files: Change Chromosome Notation"


Updated

Thanks a lot for the information! Here is an update for anyone who may need it.


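First, prepare a conversion table that maps the contig names of one reference to the other (hg19 and hs37d5).

Each line follows the pattern Old_chrname\tNew_chrname\n, where \t stands for a tab character and \n for the end of the line. In the examples below, the hg19 name is the old name and the hs37d5 name is the new one: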

chr11_gl000202_random\tGL000202.1\n

chrUn_gl000244\tGL000244.1\n

chrUn_gl000235\tGL000235.1\n
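For the primary chromosomes, the table does not have to be typed by hand. The following is a minimal sketch, assuming the direction used in the examples above (hg19 names on the left, hs37d5 names on the right). The file name hg19_to_hs37d5.tsv is only a placeholder. chrM is deliberately left out, because hg19 chrM and hs37d5 MT are different mitochondrial sequences, so renaming alone would not make their coordinates comparable. Entries for unplaced contigs, such as the three shown above, would still need to be appended separately.

```shell
# Placeholder output file name (an assumption, not from the original post).
table=hg19_to_hs37d5.tsv

# Start from an empty file.
: > "$table"

# Write one "old<TAB>new" line per primary chromosome: chr1 -> 1, ..., chrY -> Y.
for c in $(seq 1 22) X Y; do
    printf 'chr%s\t%s\n' "$c" "$c" >> "$table"
done
```

To rename in the opposite direction (hs37d5 to hg19), the two columns would simply be swapped in the printf format string.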

Second, run bcftools annotate with the conversion table file and write the output as vcf.gz:

bcftools annotate --rename-chrs convert_table_file Original.vcf.gz | bcftools view -Oz -o Output.vcf.gz
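Before running the rename, it may be worth checking whether every contig in the VCF has an entry in the conversion table, since any contig missing from the table would presumably keep its old name. The helper below is a hypothetical sketch (unmapped_contigs is not part of bcftools); it uses only gzip and awk and assumes a non-empty, tab-separated table with the old name in the first column.

```shell
# List contig names that occur in a VCF but have no entry in the rename table.
# Arguments: $1 = gzip/bgzip-compressed VCF, $2 = rename table (old<TAB>new).
# Assumes the table is not empty.
unmapped_contigs() {
    gzip -dc "$1" \
        | awk -F '\t' '
            NR == FNR { known[$1] = 1; next }        # first input: the rename table
            /^#/      { next }                       # skip VCF header lines
            !($1 in known) && !seen[$1]++ { print $1 }  # report each unknown contig once
        ' "$2" -
}
```

With the file names from the command above, it could be called as unmapped_contigs Original.vcf.gz convert_table_file. No output means every contig found in the VCF records is covered by the table.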


Don't forget to follow up on your threads. If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted. You can accept more than one answer if they all work. If an answer was not really helpful or did not work, provide detailed feedback so others know not to use that answer.
