Hello,
I am a novice with plink, working with .bim, .fam and .bed files whose rsIDs are outdated. With the help of the UCSC Genome Browser, I have generated a new .txt file that I hope to use to update the rsIDs by chromosome position:
system("./plink --bim oldBim --update-name onlySNPs.uniqLocAndld.txt 2 4 --make-just-bim -out newBim")
Unfortunately, no rsIDs are updated (the command runs to completion with zero updates). Would anybody know what I may be doing wrong?
The new onlySNPs.uniqLocAndld.txt was generated with the following commands, which keep only entries with a unique location and ID:
curl -O https://hgdownload.gi.ucsc.edu/goldenPath/hg38/database/snp151Common.txt.gz
gunzip -c snp151Common.txt.gz | cut -f 2,4,5,12,17 | grep single.exact | cut -f 1-3 > onlySNPs.tsv
sort -k3 -u onlySNPs.tsv | sort -k1,2 -u > onlySNPs.uniqLocAndId.tsv
Any help would be greatly appreciated.
I see. Thank you. I was under the impression that the chromosome end position in the .txt file was all that --update-name needed to look up and replace each rsID, as long as the correct fields ("2 4") were specified. I suppose that isn't correct.
I have updated to plink 2.0, but the documentation for --set-all-var-ids is unclear to me. How would one use it in this instance? Also, for future reference, how would I reformat my onlySNPs.uniqLocAndld.txt file so that --update-name works?
You should take a step back and learn how to use some Unix text-processing tools. The most relevant ones are "cut", "paste", "head"/"tail", and to a lesser degree "sed" and "awk" ("sed" and "awk" are somewhat complicated, but you only need to be aware of their existence and learn a few simple usage patterns for now). I can see from your question that you should already be aware of "cut".
After you've done this, you should be able to come up with a coherent way to use --set-all-var-ids in this context.
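For anyone landing here later, here is a minimal sketch of how those pieces might fit together. --update-name matches variants on their current ID, not on position, so the idea is to have --set-all-var-ids rewrite every ID as chrom:pos and then supply a two-column file mapping chrom:pos to rsID. The toy file and its rsIDs below are made up, the oldData and posIds prefixes are hypothetical, and stripping the "chr" prefix assumes the .bim uses bare chromosome codes such as "1" and is on the same genome build (hg38) as the UCSC table.

```shell
# Toy stand-in for the 3-column file (chrom, chromEnd, rsID); the values are made up.
printf 'chr1\t10177\trs_example1\nchr2\t20500\trs_example2\n' > toy_snps.tsv

# Build a two-column file for --update-name: old ID (chrom:pos) first, new ID (rsID) second.
# sub() strips the UCSC "chr" prefix, assuming the .bim uses bare codes such as "1".
awk 'BEGIN { FS = OFS = "\t" } { sub(/^chr/, "", $1); print $1 ":" $2, $3 }' \
    toy_snps.tsv > toy_update_name.txt

cat toy_update_name.txt

# With real data, the plink2 steps might then look like this (hypothetical prefixes):
#   plink2 --bfile oldData --set-all-var-ids '@:#' --make-just-bim --out posIds
#   plink2 --bed oldData.bed --bim posIds.bim --fam oldData.fam \
#          --update-name toy_update_name.txt --make-just-bim --out newBim
```

With the old ID in column 1 and the new ID in column 2, no column arguments should be needed after the filename, since that is the layout --update-name expects by default. It is worth comparing a few lines of posIds.bim against the mapping file with "head" before the second step, to confirm that the chrom:pos strings really have the same form in both.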