8.9 years ago
dam4l
Hi,
I have a .txt file with multiple columns, one of which lists SNPs in chr:pos format. Is it possible to convert from chr:pos to rs number using PLINK? If so, what commands are required?
Thanks!
Additionally, I have ~15 million SNPs.
Do you have a list with the corresponding rs? Ex:
1:123467932 rs123456789
No I don't. I just have the SNPs in a .bim file with the following columns:
To change the SNP names, you need a list with all available SNPs. Use --update-map and --update-name to change your SNP names. WARNING: Some positions have more than one SNP.
Thanks for your reply! Which settings on the UCSC Table Browser will allow me to download this list?
get output
Select Fields from...
Hi all! I don't know if this is a recent change, but I could not download the file using the method described here: the download times out because the file is too big for the Table Browser. I found the file containing all SNPs in the UCSC download directory (http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/snp151.txt.gz).
[UPDATE] But actually, I used only the common SNPs table.
Thanks a lot! I have a question. Does "position" in "CHR:POSITION" SNP IDs refer to chromosome start or end position?
I think it's the end position, but double-check on the UCSC browser by viewing your SNP of interest. Bear in mind that this rule only works for single nucleotide variants (multiallelic variants do not obey this rule).
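To make the start/end point concrete: UCSC tables store chromStart as 0-based and chromEnd as 1-based, so for a variant spanning exactly one base chromEnd is the usual 1-based position. Below is a rough Python sketch of turning snp151-style rows into a two-column chr:pos / rsID list. It assumes the usual snp table layout (chrom, chromStart, chromEnd, name and class in columns 2, 3, 4, 5 and 12), and the function names and sample rows are made up for illustration. Please check the column order against the table schema, and make sure the genome build and chromosome coding match your .bim file.

```python
import gzip

def ucsc_rows_to_map(lines):
    """Yield (chr:pos, rsID) pairs from tab-separated UCSC snp-table rows."""
    for line in lines:
        f = line.rstrip("\n").split("\t")
        if len(f) < 12:
            continue  # skip short or malformed rows
        chrom, start, end, name, var_class = f[1], int(f[2]), int(f[3]), f[4], f[11]
        # Keep only variants of class "single" that span exactly one base.
        if var_class != "single" or end - start != 1:
            continue
        # Assumption: the .bim file codes chromosomes as "1", not "chr1".
        if chrom.startswith("chr"):
            chrom = chrom[3:]
        # chromEnd is the 1-based position of a single-base variant.
        yield f"{chrom}:{end}", name

def write_map(gz_path, out_path):
    """Stream a gzipped snp table and write 'chr:pos<TAB>rsID' lines."""
    with gzip.open(gz_path, "rt") as src, open(out_path, "w") as out:
        for key, rsid in ucsc_rows_to_map(src):
            out.write(f"{key}\t{rsid}\n")

# Made-up rows in the assumed column layout (these rsIDs are not real).
sample = [
    "\t".join(["585", "chr1", "10018", "10019", "rs111111", "0", "+",
               "T", "T", "G/T", "genomic", "single"]),
    "\t".join(["585", "chr1", "10019", "10021", "rs222222", "0", "+",
               "AA", "AA", "-/AA", "genomic", "deletion"]),
]
print(list(ucsc_rows_to_map(sample)))
```

Only the first sample row survives the filter, because the second one is a two-base deletion. On the real file you would call something like write_map("snp151.txt.gz", "chrpos_to_rs.txt"), where the output name is just a placeholder.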
Thanks so much!
Thank you so much! I followed your method to download the chromosome/position/rs# information and was ready to do the conversion using the plink --update-name flag, but realized that some positions have multiple rs#, and plink just stopped with an error. How can I remove the duplicates from the >100,000,000 SNPs (I did a whole-genome imputation, so I downloaded the SNPs for the whole genome as you suggested)?
You have two possible solutions.
Hi,
I'm facing a similar duplicate variant ID error after using the --update-name flag in plink. My error is shown below.
For possible solution 1) "Do not update any SNP with two or more possible rs", what do you mean by that? How can I avoid updating the chr:pos IDs that have duplicate rs IDs?
Thank you for your help!
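One way to read option 1 (skip every position that has more than one candidate rs) is to filter the chr:pos / rsID list before giving it to plink, so that ambiguous positions simply keep their chr:pos name. Here is a minimal Python sketch, assuming the list is available as (chr:pos, rsID) pairs. The function name and the optional wanted argument are my own, not anything from plink.

```python
from collections import defaultdict

def unique_positions(pairs, wanted=None):
    """Return {chr:pos: rsID} for positions that map to exactly one rsID.

    pairs  : iterable of (chr:pos, rsID) tuples
    wanted : optional set of chr:pos IDs (e.g. column 2 of the .bim file);
             anything outside it is ignored, which saves memory.
    """
    seen = defaultdict(set)
    for key, rsid in pairs:
        if wanted is not None and key not in wanted:
            continue
        seen[key].add(rsid)
    # Positions with two or more distinct rsIDs are dropped entirely.
    return {key: next(iter(ids)) for key, ids in seen.items() if len(ids) == 1}

# Made-up example pairs.
pairs = [
    ("1:100", "rsA"),
    ("1:200", "rsB"), ("1:200", "rsC"),   # ambiguous: two rsIDs
    ("1:300", "rsD"), ("1:300", "rsD"),   # exact repeat: still one rsID
]
print(unique_positions(pairs))
```

With >100 million rows it probably helps to pass the set of IDs from your .bim file as wanted, so only positions you actually have are kept in memory.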
Re: "2." How do you download the alleles for each SNP from UCSC? And how do you choose the correct rsID by comparing alleles?
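For the allele comparison, the UCSC snp tables carry an "observed" field (for example A/G) alongside the rs name, so as far as I can tell it comes from the same download. The idea can be sketched in Python as follows, assuming you have the two alleles from columns 5 and 6 of the .bim file and a list of candidate (rsID, observed) pairs for that position. The helper names are hypothetical, and the strand handling is a simplification: a candidate counts as matching if the .bim alleles fit on either strand.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def alleles_match(bim_a1, bim_a2, observed):
    """True if both .bim alleles appear in a UCSC-style 'observed' string
    such as 'A/G', read on either strand."""
    bim = {bim_a1.upper(), bim_a2.upper()}
    obs = set(observed.upper().split("/"))
    flipped = {a.translate(COMPLEMENT) for a in obs}
    return bim <= obs or bim <= flipped

def pick_rsid(bim_a1, bim_a2, candidates):
    """Return the rsID whose alleles match, or None if zero or several match."""
    hits = [rs for rs, obs in candidates if alleles_match(bim_a1, bim_a2, obs)]
    return hits[0] if len(hits) == 1 else None

# Made-up candidates for one position.
print(pick_rsid("A", "G", [("rs1", "A/G"), ("rs2", "A/C")]))
print(pick_rsid("A", "G", [("rs1", "A/G"), ("rs2", "C/T")]))
```

The second call returns None because C/T on the opposite strand reads as G/A, so both candidates fit and the position stays ambiguous. In that case I would leave the chr:pos name as it is.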
Where do I put the file with chr:pos and rsID?
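The file does not need to live anywhere special: you pass its path directly after --update-name. By default PLINK 1.9 reads the old ID from column 1 and the new ID from column 2, so a plain two-column chr:pos / rsID text file should work. Below is a small Python sketch that only builds the command line. The file names (mydata, chrpos_to_rs.txt, mydata_rs) are placeholders, and it assumes plink is on your PATH if you decide to run it.

```python
import shlex

def plink_update_name_cmd(bfile, update_file, out):
    """Build a PLINK 1.9 command that renames variants and writes a new fileset."""
    return [
        "plink",
        "--bfile", bfile,              # prefix of the .bed/.bim/.fam input
        "--update-name", update_file,  # column 1: old ID, column 2: new ID
        "--make-bed",
        "--out", out,                  # prefix for the renamed output fileset
    ]

cmd = plink_update_name_cmd("mydata", "chrpos_to_rs.txt", "mydata_rs")
print(" ".join(shlex.quote(c) for c in cmd))

# To actually run it (requires plink to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

If the update file still contains a chr:pos that appears on more than one line, plink will most likely stop again, so it is worth removing the duplicated positions from the file first.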