I have two VCF files that I would like to merge using GATK or VCFtools. The problem is that they use different chromosome notation: one has the Chr prefix, the other does not. This question may be similar to this one.
Are there any quick awk/sed commands you would suggest? I would also appreciate comments on which of the two (GATK or VCFtools) is more reliable for this task.
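For the simple prefix case, a minimal awk sketch could look like the following. It assumes a lowercase "chr" prefix (adjust the pattern if your file really uses "Chr"), an uncompressed VCF, and made-up file names (with_chr.vcf, no_chr.vcf, chr_again.vcf) and records that exist only for illustration. Note that it does not handle naming differences beyond the prefix, such as chrM versus MT for the mitochondrial genome.

```shell
#!/bin/sh
# Tiny made-up VCF, only for demonstration.
printf '##fileformat=VCFv4.2\n##contig=<ID=chr1>\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\nchr1\t100\t.\tA\tG\t50\tPASS\t.\n' > with_chr.vcf

# Remove the prefix: fix ##contig header lines and the start of each record.
awk '{
    if ($0 ~ /^##contig=<ID=chr/) sub(/ID=chr/, "ID=")
    else if ($0 !~ /^#/) sub(/^chr/, "")
    print
}' with_chr.vcf > no_chr.vcf

# The reverse direction: add the prefix to ##contig lines and to each record.
awk '{
    if ($0 ~ /^##contig=<ID=/) sub(/ID=/, "ID=chr")
    else if ($0 !~ /^#/) $0 = "chr" $0
    print
}' no_chr.vcf > chr_again.vcf
```

Renaming the ##contig header lines together with the records keeps the header consistent with the data, which merge tools generally expect.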
The awk-based answers below are confusing. Just use
bcftools annotate --rename-chrs
as highlighted by @jerviedog. This will also work with appropriate subsets of NCBI's assembly_report.txt files.

I am very new to this and ran into a similar but slightly more complicated problem today with the Cryptococcus genome. I think I solved it thanks to the help and links posted here, and since I didn't find a solution elsewhere, I thought I should post it in case someone comes along with a similar problem. The reference genome I use has neither numerical (1, 2, 3) nor chr (chr1, chr2, chr3) notation; it has wacky chromosome names (CP003827, CP003822, etc.). So, to replace the chromosome names in a VCF file with numerical ones, I used a series of grep commands in awk:
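The commands themselves are not shown here. As a rough sketch of this kind of renaming (not necessarily what was actually run), one awk pass can look names up in a two-column mapping file. The file names (chr_map.txt, input.vcf, renamed.vcf), the sample records, and the numbers assigned to each accession are all placeholders, not the real chromosome assignments.

```shell
#!/bin/sh
# Hypothetical mapping: old name <TAB> new name. The numbers are placeholders.
printf 'CP003827\t1\nCP003822\t2\n' > chr_map.txt

# Tiny made-up VCF, only for demonstration.
printf '##fileformat=VCFv4.2\n##contig=<ID=CP003827>\n##contig=<ID=CP003822>\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\nCP003827\t100\t.\tA\tG\t50\tPASS\t.\nCP003822\t200\t.\tC\tT\t60\tPASS\t.\n' > input.vcf

awk 'BEGIN { FS = OFS = "\t" }
     NR == FNR { map[$1] = $2; next }          # first file: load the mapping
     /^##contig=<ID=/ {                        # header: rename contig IDs
         id = $0
         sub(/^##contig=<ID=/, "", id)         # drop everything before the ID
         sub(/[,>].*$/, "", id)                # drop everything after the ID
         if (id in map) sub("ID=" id, "ID=" map[id])
         print; next
     }
     /^#/ { print; next }                      # other header lines unchanged
     { if ($1 in map) $1 = map[$1]; print }    # records: rename column 1
' chr_map.txt input.vcf > renamed.vcf
```

Names missing from the mapping file pass through unchanged, so it is worth checking the output for leftover accessions afterwards.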
I had no knowledge of awk before stumbling onto this post so there might be a more elegant way to do this, but this seems to work, which is good enough for me!
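For comparison, the bcftools annotate --rename-chrs route mentioned earlier in the thread takes a plain text file of "old_name new_name" pairs, one per line. The sketch below assumes bcftools is installed and skips the call otherwise. The file names (accession_map.txt, accessions.vcf, renamed_bcftools.vcf), the sample record, and the numbers given to each accession are placeholders.

```shell
#!/bin/sh
# Hypothetical mapping file: old name, whitespace, new name (placeholder numbers).
printf 'CP003827 1\nCP003822 2\n' > accession_map.txt

# Tiny made-up VCF, only for demonstration.
printf '##fileformat=VCFv4.2\n##contig=<ID=CP003827>\n##contig=<ID=CP003822>\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\nCP003827\t100\t.\tA\tG\t50\tPASS\t.\n' > accessions.vcf

# Rename contigs in both header and records; write uncompressed VCF (-O v).
if command -v bcftools >/dev/null 2>&1; then
    bcftools annotate --rename-chrs accession_map.txt -O v -o renamed_bcftools.vcf accessions.vcf
else
    echo "bcftools not found; skipping the rename step" >&2
fi
```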