18 months ago
Can Abdullah
I have 867 VCF files containing variants that I need to annotate against hg38. The problem is that the variant positions in these files are based on hg19. Whenever I run an annotation, I cannot retrieve the necessary information because of this reference genome mismatch. I need a way to convert the variants so that they can be annotated against hg38.
What tools can I use for this?
CrossMap: https://crossmap.sourceforge.net/
Picard LiftoverVcf: https://gatk.broadinstitute.org/hc/en-us/articles/360037060932-LiftoverVcf-Picard-
Looks like @Pierre also has: http://lindenb.github.io/jvarkit/VcfLiftOver.html
Thanks. You'd better use Picard, though: it checks that the alleles are still consistent with the new reference. https://github.com/lindenb/jvarkit/blob/master/src/main/java/com/github/lindenb/jvarkit/tools/liftover/VcfLiftOver.java#L129
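For a batch of several hundred files, the Picard call could be scripted roughly as below. This is only a sketch: the directory layout, the chain file name (`hg19ToHg38.over.chain.gz`), the reference name (`hg38.fa`) and the helper names `build_liftover_cmd` and `lift_all` are assumptions for illustration, not something from this thread. It also assumes the contig names in the VCFs match those in the chain file, and that the hg38 FASTA has a sequence dictionary next to it, which Picard expects.

```python
import subprocess
from pathlib import Path


def build_liftover_cmd(vcf, out_dir, picard_jar, chain, target_fasta):
    """Build the Picard LiftoverVcf command line for one VCF (hypothetical helper)."""
    vcf = Path(vcf)
    out_dir = Path(out_dir)
    # Strip a .vcf or .vcf.gz suffix to get a base name for the outputs.
    stem = vcf.name
    for suffix in (".vcf.gz", ".vcf"):
        if stem.endswith(suffix):
            stem = stem[: -len(suffix)]
            break
    return [
        "java", "-jar", str(picard_jar), "LiftoverVcf",
        f"I={vcf}",                                      # input VCF (hg19 coordinates)
        f"O={out_dir / (stem + '.hg38.vcf')}",           # lifted-over variants
        f"CHAIN={chain}",                                # hg19 -> hg38 chain file
        f"REJECT={out_dir / (stem + '.rejected.vcf')}",  # variants Picard could not lift
        f"R={target_fasta}",                             # target (hg38) reference FASTA
    ]


def lift_all(in_dir, out_dir, picard_jar, chain, target_fasta):
    """Run LiftoverVcf on every *.vcf in in_dir; return the inputs that failed."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    failed = []
    for vcf in sorted(Path(in_dir).glob("*.vcf")):
        cmd = build_liftover_cmd(vcf, out_dir, picard_jar, chain, target_fasta)
        if subprocess.run(cmd).returncode != 0:
            failed.append(vcf)
    return failed
```

A call such as `lift_all("vcfs_hg19", "vcfs_hg38", "picard.jar", "hg19ToHg38.over.chain.gz", "hg38.fa")` would then leave one lifted VCF and one rejected VCF per input file, with all of those paths being placeholders.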
Thanks a lot. I performed the conversion with the picard.jar I downloaded from the Broad Institute GATK website. It converted most of the variants correctly, as I cross-checked with the UCSC LiftOver tool (https://genome.ucsc.edu/cgi-bin/hgLiftOver). However, for a VCF file with 260 variants, for example, Picard lifted over only 254 of the positions, while the UCSC tool converted all of them.
My question is: what should I do with the variants that Picard did not lift over? After converting the positions, I will annotate the variants to predict their phenotype and causativeness. With that in mind, which tool is more trustworthy?
Thank you.
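One way to look into the variants Picard did not lift over is to tally the reasons recorded in the rejected VCF (the file given as `REJECT=`). The sketch below assumes Picard writes the rejection reason into the FILTER column of that file; the function name `tally_reject_reasons` and the file name in the usage note are hypothetical. A coordinates-only conversion cannot notice a reference allele that no longer matches hg38, which may be one reason the two tools disagree, so it seems worth checking why each record was rejected before trusting either result.

```python
from collections import Counter


def tally_reject_reasons(lines):
    """Count FILTER values across the records of a rejected VCF.

    `lines` is any iterable of text lines, e.g. an open file handle.
    Assumption: the rejection reason is stored in the FILTER column.
    """
    reasons = Counter()
    for line in lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 7:
            continue  # not a complete VCF record
        # FILTER is the 7th column; multiple values are separated by ';'.
        for reason in fields[6].split(";"):
            reasons[reason] += 1
    return reasons
```

For example, `with open("sample1.rejected.vcf") as fh: print(tally_reject_reasons(fh))` would print how many of the rejected records fall under each reason.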