Hi, all -
I wanted to ask how I can find the genome build of a PLINK file. I only have the data in BED, BIM, and FAM format, and I want to confirm its genome build so that it aligns with another dataset before I merge the two. Thanks!
Thanks so much for that! A few follow-up questions:
1) One of the SNP IDs is rs1839669. When I look that up on dbSNP, I'm led to the following page: https://www.ncbi.nlm.nih.gov/snp/rs1839669#variant_details. When I navigate to the "Variant Details" page, I find two options for my build: GRCh37.p13 chr 2 or GRCh38.p14 chr 2. How do I determine which one it is? My base-pair coordinate for that ID is 98157865, which doesn't align with either of the base-pair coordinates in the two options on dbSNP.
2) What happens if the SNP ID doesn't start with "rs"? For instance, for one of my files, the SNP ID is structured as follows: SNP_A-2242008. How would I determine the genomic build then?
Thanks so much for all your help!
maybe look at another SNP :-)
For a more systematic search, convert the BIM file to VCF, then load it into IGV alongside the dbSNP files for each build and compare the positions that way.
I think it ought to be clear then what is what.
Perhaps your build is an even earlier one (older than GRCh37) ...
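If you'd rather script the comparison than eyeball it in IGV, here is a minimal Python sketch of the same idea: read the BIM file and count how many of your variants sit at the position each candidate build reports for the same ID. This is not the IGV workflow itself, just a rough stand-in for it. The function names and the `reference_positions` structure are my own invention, and you would have to fill that dictionary yourself from the dbSNP files for each build (including an older build, if you suspect one). It also assumes chromosome codes are written the same way in both sources (e.g. "2" rather than "chr2").

```python
from collections import Counter


def parse_bim(lines):
    """Yield (variant_id, chrom, bp) from the lines of a PLINK .bim file.

    A .bim line has six whitespace-separated columns:
    chromosome, variant ID, genetic distance, bp position, allele 1, allele 2.
    """
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        chrom, variant_id, _cm, bp = fields[:4]
        yield variant_id, chrom, int(bp)


def count_build_matches(bim_lines, reference_positions):
    """Count, per build, how many .bim variants match that build's coordinates.

    reference_positions is a hypothetical, user-supplied mapping:
        {build_name: {variant_id: (chrom, bp)}}
    """
    matches = Counter({build: 0 for build in reference_positions})
    for variant_id, chrom, bp in parse_bim(bim_lines):
        for build, positions in reference_positions.items():
            if positions.get(variant_id) == (chrom, bp):
                matches[build] += 1
    return matches


if __name__ == "__main__":
    # Placeholder IDs and coordinates, made up purely for illustration.
    demo_bim = [
        "1\tsnp_demo1\t0\t1000\tA\tG",
        "1\tsnp_demo2\t0\t2000\tC\tT",
    ]
    demo_reference = {
        "build_old": {"snp_demo1": ("1", 1000), "snp_demo2": ("1", 2000)},
        "build_new": {"snp_demo1": ("1", 1500), "snp_demo2": ("1", 2500)},
    }
    for build, n in count_build_matches(demo_bim, demo_reference).items():
        print(build, n)
```

With a real file you would pass `open("yourdata.bim")` as `bim_lines`. The build that matches nearly all of your variants is very likely the one your data is on; a single SNP that matches nothing is not conclusive, which is why checking many variants helps.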