7.7 years ago
cl10101
I have a VCF file containing variants called from data mapped to hg38 (the reference from the GATK hg38 bundle), and I would like to compare these variants with a VCF file from the 1000 Genomes Project, which is mapped to GRCh37. By comparing these files I mean finding the variants shared by both. Are the coordinates in these files the same, or do I need to convert them?
Thank you. Unfortunately, I can't find a liftover chain file for g1k_v37 to hg38. I tried hg19ToHg38.over.chain from http://hgdownload.soe.ucsc.edu/goldenPath/hg19/liftOver/ but the output file is empty and all records were rejected. I found out that g1k_v37 is GRCh37 with slight differences.
hg19 = v37: on the primary chromosomes the coordinates are the same, only the chromosome names differ.
You're looking for http://hgdownload.soe.ucsc.edu/goldenPath/hg19/liftOver/hg19ToHg38.over.chain.gz. If needed, convert the chromosome names by removing the chr prefix.
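In case it helps, here is a minimal sketch of the renaming step in Python. It assumes the UCSC chain layout, where each header line starts with `chain` and the third field is the source (hg19) sequence name, and it strips `chr` from that field only, so a g1k_v37-style input would be lifted to hg38 names that keep the `chr` prefix. The function names and output path are made up for illustration, and chrM/MT is deliberately not handled.

```python
import gzip


def strip_chr_from_chain_source(lines):
    """Yield chain lines with 'chr' removed from the source name (3rd header field).

    Only header lines (those starting with 'chain') are modified; the
    alignment block lines are passed through unchanged.
    """
    for line in lines:
        fields = line.split()
        if fields and fields[0] == "chain" and fields[2].startswith("chr"):
            fields[2] = fields[2][3:]  # e.g. 'chr1' -> '1'
            yield " ".join(fields) + "\n"
        else:
            yield line


def rewrite_chain(in_path, out_path):
    """Hypothetical helper: read a gzipped chain file, write a plain-text copy."""
    with gzip.open(in_path, "rt") as src, open(out_path, "w") as dst:
        dst.writelines(strip_chr_from_chain_source(src))


# Example call (paths are placeholders):
# rewrite_chain("hg19ToHg38.over.chain.gz", "b37ToHg38.over.chain")
```

Leaving the hg38 side untouched is a choice on my part: the GATK hg38 bundle reference uses `chr`-prefixed names, so the lifted VCF should then match your hg38 calls directly.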
Thank you for the response. I removed the chr prefix in the hg19ToHg38.over.chain file as you suggested, and now I get this error:
"Exception in thread "main" htsjdk.tribble.TribbleException: Badly formed variant context at location chr1:789016; getEnd() was 789016 but this VariantContext contains an END key with value 724396"
I would be grateful for any suggestions on how to solve this problem.
show me
there is no output for 789016
try "grep -Fw 724396 your.vcf"
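If grep finds the record, a broader check may be useful. This is only a sketch, and it rests on my guess that the failing records are ones carrying an END key in the INFO column whose value was not updated when POS was lifted. It lists every record in a VCF that has an END key, so you can see how many are affected. The function name is made up, and it parses the text directly rather than using a VCF library.

```python
def records_with_end(vcf_lines):
    """Yield (chrom, pos, end) for every VCF record whose INFO column has an END key."""
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip meta-information and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        chrom, pos, info = fields[0], int(fields[1]), fields[7]
        for entry in info.split(";"):
            if entry.startswith("END="):
                yield chrom, pos, int(entry[len("END="):])


# Example use (file name is a placeholder):
# with open("your.vcf") as handle:
#     for chrom, pos, end in records_with_end(handle):
#         print(chrom, pos, end)
```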
I got the same problem:
and the corresponding record is: