I have a very large VCF file where the 'ID' column is a unique ID consisting of 'chr:bp'. I would like to update the 'ID' column to dbSNP IDs.
I have downloaded a BED file [chr, from, to, rsid], which I have sorted and tabix indexed. The BED file is for hg19, which is correct for my data; chromosomes are formatted with 'chr' and there is no header.
It seems that the bcftools annotate function does allow the 'ID' column to be updated, but I am not clear how. I have tried:
i) bcftools annotate -a dbsnp.bed.gz -c 'CHROM,POS,-,ID' my.vcf.gz
ii) bcftools annotate -a dbsnp.bed.gz -c 'CHROM,FROM,TO,ID' my.vcf.gz
Neither of these updated the ID column. I also tried removing the 'ID' from the VCF first (-R) and piping the VCF into the two commands above. Perhaps this is not the right tool? Any advice appreciated.
This worked. It was my chromosome names that were the issue. I thought the BED file had to have the 'chr1' format in order to be tabix indexed.
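For anyone hitting the same problem, a quick way to spot this kind of mismatch is to compare the sequence names in the two indexed files. The sketch below is only an illustration: `names_only_in_first` and the `*_chroms.txt` file names are made up for this example, and it assumes tabix is installed and both files are bgzipped and tabix indexed.

```shell
# Print names that appear in the first list but not in the second.
# names_only_in_first is a hypothetical helper, not part of tabix or bcftools.
names_only_in_first() {
    sort -u "$1" > "$1.sorted"
    sort -u "$2" > "$2.sorted"
    # comm -23 keeps only the lines unique to the first sorted file
    comm -23 "$1.sorted" "$2.sorted"
}

# Only run against real data when tabix and both input files are present.
if command -v tabix >/dev/null 2>&1 && [ -f my.vcf.gz ] && [ -f dbsnp.bed.gz ]; then
    tabix -l my.vcf.gz    > vcf_chroms.txt   # sequence names in the VCF
    tabix -l dbsnp.bed.gz > bed_chroms.txt   # sequence names in the BED
    # Any output here is a VCF chromosome name with no match in the BED
    names_only_in_first vcf_chroms.txt bed_chroms.txt
fi
```

If the VCF lists names like 1, 2, X while the BED lists chr1, chr2, chrX, every VCF name is printed, and bcftools annotate will find no matching records.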
@mrxcm3,
How do you fix chromosome names?
bcftools annotate with the --rename-chrs parameter
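A minimal sketch of how that could look, assuming the VCF uses bare names (1, 2, ..., X, Y, MT) and the annotation file uses 'chr' names (chr1, ..., chrM); swap the two columns if your files are the other way round. --rename-chrs takes a file of whitespace-separated "old_name new_name" pairs, one per line. The file names chr_name_map.txt and my.chr.vcf.gz are made up for this example.

```shell
# Build the rename map: one "old_name<TAB>new_name" pair per line.
# Assumption: VCF has 1..22, X, Y, MT and the target naming is chr1..chr22, chrX, chrY, chrM.
map_file=chr_name_map.txt   # hypothetical file name
: > "$map_file"
for c in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X Y; do
    printf '%s\tchr%s\n' "$c" "$c" >> "$map_file"
done
printf 'MT\tchrM\n' >> "$map_file"

# Only run against real data when bcftools and the input file are present.
if command -v bcftools >/dev/null 2>&1 && [ -f my.vcf.gz ]; then
    # Write a renamed, bgzipped copy of the VCF and index it
    bcftools annotate --rename-chrs "$map_file" -O z -o my.chr.vcf.gz my.vcf.gz
    bcftools index -t my.chr.vcf.gz
fi
```

The renamed file can then be used in the annotate command from the question, e.g. with -c 'CHROM,FROM,TO,ID'.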