Hi,
I recently converted a VCF file containing 40 samples into PLINK binary format using the PLINK --make-bed flag. The resulting .bim file (input_data.bim) has the following format:
10 . 0 45265 A C
10 . 0 45402 T C
10 . 0 45781 C CA
10 . 0 46126 G A
10 . 0 46915 T C
10 . 0 47001 CAGAACACAGTAA C
My aim is to replace the . in the second column with a dbSNP rsID by cross-referencing the chromosome and position in columns 1 and 4. I found this previous post a good starting point and am trying to follow the same logic, but I must be missing something.
I have my .bim, .bed and .fam PLINK files, plus the dbsnp153.txt file downloaded from the UCSC Genome Browser. The download included all fields by default, but I've trimmed it to the format below (filename: hg38_dbsnp153_final):
#chrom chromStart name
1 10177 rs367896724
1 10352 rs555500075
1 11007 rs575272151
1 11011 rs544419019
1 13109 rs540538026
1 13115 rs62635286
I then run the following:
sudo plink1.9 --bfile input_data --update-name hg38_dbsnp153_final --make-bed --out mydata
This results in the following duplicate ID error:
PLINK v1.90b6.16 64-bit (17 Feb 2020) www.cog-genomics.org/plink/1.9/
(C) 2005-2020 Shaun Purcell, Christopher Chang GNU General Public License v3
Logging to mydata.log.
Options in effect:
--bfile input_data
--make-bed
--out mydata
--update-name hg38_dbsnp153_final
128894 MB RAM detected; reserving 64447 MB for main workspace.
35624 variants loaded from .bim file.
58 people (0 males, 0 females, 58 ambiguous) loaded from .fam.
Ambiguous sex IDs written to mydata.nosex .
Error: Duplicate ID '.'.
Can anyone suggest a way in which I can resolve this and assign dbsnp rsids to the currently blank second column of my .bim file?
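To make the intended lookup concrete, here is a minimal Python sketch of the cross-referencing described above. It is not a PLINK feature. As far as I understand, --update-name matches variants by their current ID rather than by position, which would explain why a file full of '.' IDs fails. The function names (load_rsid_lookup, annotate_bim, rewrite_bim) are hypothetical. The sketch assumes the positions in hg38_dbsnp153_final are already in the same coordinate system as column 4 of the .bim (UCSC chromStart values are 0-based, so an offset may be needed), that chromosome names match once any leading "chr" is removed, and that only the first rsID listed for a position is kept. The chrom:pos:a1:a2 fallback for unmatched variants is my own choice, made so that no '.' IDs remain.

```python
def load_rsid_lookup(lines):
    """Build a {(chrom, pos): rsID} dict from 'chrom pos name' lines."""
    lookup = {}
    for line in lines:
        # Skip blank lines and the '#chrom chromStart name' header.
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos, rsid = line.split()[:3]
        # Assumption: the .bim uses '10', not 'chr10'.
        if chrom.startswith("chr"):
            chrom = chrom[3:]
        # Keep the first rsID seen for a position; later ones are ignored.
        lookup.setdefault((chrom, int(pos)), rsid)
    return lookup


def annotate_bim(bim_lines, lookup):
    """Return .bim lines with '.' IDs replaced by an rsID where one is found."""
    out = []
    for line in bim_lines:
        if not line.strip():
            continue
        fields = line.split()
        if len(fields) != 6:
            raise ValueError(f"expected 6 columns in .bim line: {line!r}")
        chrom, var_id, cm, pos, a1, a2 = fields
        if var_id == ".":
            # Fallback (my assumption): a chrom:pos:a1:a2 ID for unmatched variants.
            fallback = f"{chrom}:{pos}:{a1}:{a2}"
            var_id = lookup.get((chrom, int(pos)), fallback)
        out.append("\t".join([chrom, var_id, cm, pos, a1, a2]))
    return out


def rewrite_bim(bim_path, lookup_path, out_path):
    """File-based wrapper: write an annotated copy of the .bim to out_path."""
    with open(lookup_path) as fh:
        lookup = load_rsid_lookup(fh)
    with open(bim_path) as fh:
        new_lines = annotate_bim(fh, lookup)
    with open(out_path, "w") as fh:
        fh.write("\n".join(new_lines) + "\n")
```

Something like rewrite_bim("input_data.bim", "hg38_dbsnp153_final", "annotated.bim") would then write a new .bim with the same variant order, to be used alongside copies of the original .bed and .fam. Loading the whole dbSNP table into a dict may need a lot of memory, so filtering it to the chromosomes present in the .bim first is probably sensible.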