I have a huge VCF file in which I needed to update the SNP names from position-based IDs (chr1-22:xxxxxx) to rs names.
I used SnpSift as suggested here: updating SNP names in VCF file
Now the ID field of the VCF contains both the chromosome:position:alleles ID and the rsID, separated by a semicolon, like so:
22 16050783 22:16050783:A:G;rs587743568 A G . .
I then made .bed, .bim and .fam files with plink, as I need to run plink analyses, like so:
plink --vcf after.SnpSift.vcf.gz --double-id --keep-allele-order --make-bed --out my.plink.data
I now want to extract a list of rsIDs from the bed file, as suggested here: Plink: Retrieving specific SNP data for individuals in dataset
If my snps.txt contains just the rsID, like so:
cat snps.txt
rs587743568
My command fails:
plink --bfile my.plink.data --extract snps.txt --recodeA
Error: No variants remaining after --extract.
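A quick way to see why the rsID-only list matches nothing, sketched on a one-line demo file rather than the real data: --extract compares each entry against the whole variant ID in column 2 of the .bim, and there the rsID is only the tail of a longer string. The file name demo.bim and the variable names are hypothetical, and the line is assumed to mirror what plink wrote for the VCF record above.

```shell
# Hypothetical one-line .bim mirroring the VCF record shown earlier.
# .bim columns: chromosome, variant ID, cM position, bp position, allele 1, allele 2.
printf '22\t22:16050783:A:G;rs587743568\t0\t16050783\tG\tA\n' > demo.bim

# Whole-field comparison against the ID column: count the exact matches.
exact=$(awk -F'\t' '$2 == "rs587743568" { n++ } END { print n + 0 }' demo.bim)

# Substring search: count the lines that contain the rsID anywhere.
partial=$(grep -c 'rs587743568' demo.bim)

echo "exact=$exact partial=$partial"
```

Running the same two checks against my.plink.data.bim should show whether the real IDs have this combined form.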
If my snps.txt contains the full ID (chromosome, position, alleles and rsID), it works, e.g.:
cat snps.txt
22:16050783:A:G;rs587743568
plink --bfile my.plink.data --extract snps.txt --recodeA
--extract: 1 variant remaining
However, I need this to work using just the rsIDs. Can anyone help?
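One possible workaround, as a minimal sketch only: translate the rsID list into the full IDs found in the .bim, then give that translated list to --extract. It assumes every ID in the .bim has the form prefix;rsID, with the rsID after the last semicolon. The file names demo.bim and snps.fullid.txt are hypothetical, and the second demo variant is made up for illustration.

```shell
# Hypothetical two-line .bim; the second variant (rs0000001) is made-up demo data.
printf '22\t22:16050783:A:G;rs587743568\t0\t16050783\tG\tA\n' > demo.bim
printf '22\t22:16050999:C:T;rs0000001\t0\t16050999\tT\tC\n' >> demo.bim

# The rsID-only list from the question.
printf 'rs587743568\n' > snps.txt

# First file (snps.txt): remember each wanted rsID.
# Second file (demo.bim): split the ID on ';' and print the full ID
# when its last piece is one of the wanted rsIDs.
awk -F'\t' '
    NR == FNR { want[$1]; next }
    { n = split($2, parts, ";"); if (parts[n] in want) print $2 }
' snps.txt demo.bim > snps.fullid.txt

cat snps.fullid.txt
```

With my.plink.data.bim in place of demo.bim, the resulting snps.fullid.txt could then be passed to the same plink --extract command that already worked with the full ID.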