I am using PLINK v1.90b3s 64-bit (17 Jun 2015) to generate an LD matrix from a 1000G VCF file for a long list of SNPs.
I use this command to convert the VCF to a binary (bed/bim/fam) fileset:
plink --vcf ALL.chr22.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz --make-bed --out binary_fileset
I then use this command to generate the LD statistic report:
plink --r2 --bfile binary_fileset --ld-snp-list snp_chr22_sample.txt --ld-window-r2 0.8
But it returns the following error:
Error: Duplicate ID 'rs10656307'.
The SNP list file does not contain this SNP, so I think the bed file itself contains duplicate records for rs10656307. Is there a way to remove duplicated SNPs from the bed file?
Thanks, this worked for me
For some reason the cut command doesn't work for my bim file. I used
awk -F ' ' '{print $2}' ALL.chr1.bim | sort | uniq -d > 1.dups
instead
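For anyone landing here later, this is a minimal sketch of how the duplicate list might be used, run on a toy .bim file so it can be tried without the 1000G data. The file names (toy.bim, toy.dups, binary_fileset_nodups) and the toy variants are made up for illustration. The final PLINK step is left commented out and assumes PLINK 1.9's --exclude flag; note that it drops every copy of a duplicated ID, not only the extra ones.

```shell
# Toy .bim file (hypothetical data) standing in for binary_fileset.bim.
# Columns: chromosome, variant ID, cM position, bp position, allele 1, allele 2
printf '22\trs1\t0\t100\tA\tG\n'   > toy.bim
printf '22\trs2\t0\t200\tC\tT\n'  >> toy.bim
printf '22\trs2\t0\t200\tC\tCT\n' >> toy.bim
printf '22\trs3\t0\t300\tG\tA\n'  >> toy.bim

# Column 2 holds the variant IDs; awk splits on any whitespace by default,
# so this works whether the .bim is tab- or space-delimited.
# uniq -d prints one line for each ID that occurs more than once.
awk '{print $2}' toy.bim | sort | uniq -d > toy.dups

cat toy.dups

# With the real fileset, the list can then be handed to --exclude
# (assumption: output prefix binary_fileset_nodups is just an example name):
# plink --bfile binary_fileset --exclude binary_fileset.dups --make-bed --out binary_fileset_nodups
```

The --r2 command from the question would then be pointed at the new fileset with --bfile. If one copy of each duplicated variant has to be kept, the IDs would need to be made unique first rather than excluded.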