Hey everyone,
I'm trying to convert summary statistics from hg19 to hg38 while losing as few SNPs / loci as possible. In most cases I have SNP identifiers, though in some cases I have only the chromosome, hg19 position and ref / alt allele information.
I've tried the following:
LiftOver via the UCSC Unix App.
This converts most of them, but there are about 45,000 loci that are "deleted in new". However, when I check dbSNP, I can find hg38 locations for all the ones I've checked.
SNPnexus
I looked up the remaining SNPs on SNPnexus. This is able to find quite a few of them, but there are still ~10,000 loci that it can't find (that I can find when looking on dbSNP's website).
A local copy of dbSNP
I downloaded dbSNP build 151 (the "common" version, due to storage and computational restrictions) to look up hg38 positions by dbSNP ID. This also failed to find a number of SNPs, possibly because the file contains only common SNPs and the missing ones happen to be uncommon.
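For reference, the local lookup described above can be done by streaming the VCF line by line, so memory use stays small even with a larger (non-"common") file. This is only a minimal sketch: it assumes a standard tab-separated VCF with the rsID in the ID column, and the function name and the example records are made up for illustration.

```python
import gzip
from typing import Dict, Iterable, Set, Tuple


def lookup_rsids(
    vcf_lines: Iterable[str], wanted: Set[str]
) -> Dict[str, Tuple[str, int, str, str]]:
    """Return {rsID: (chrom, pos, ref, alt)} for the wanted rsIDs.

    Hypothetical helper: assumes VCF columns CHROM, POS, ID, REF, ALT.
    """
    found: Dict[str, Tuple[str, int, str, str]] = {}
    for line in vcf_lines:
        if line.startswith("#"):  # skip meta and header lines
            continue
        fields = line.rstrip("\n").split("\t", 5)
        if len(fields) < 5:  # skip malformed lines
            continue
        chrom, pos, rsid, ref, alt = fields[:5]
        if rsid in wanted:
            found[rsid] = (chrom, int(pos), ref, alt)
            if len(found) == len(wanted):  # stop early once all are found
                break
    return found


# In-memory example with invented records (not real dbSNP entries).
example_vcf = [
    "##fileformat=VCFv4.0",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t1000\trs1\tA\tG\t.\t.\t.",
    "1\t2000\trs2\tC\tT\t.\t.\t.",
    "2\t3000\trs3\tG\tA\t.\t.\t.",
]
hits = lookup_rsids(example_vcf, {"rs2", "rs3", "rs999"})
missing = {"rs2", "rs3", "rs999"} - hits.keys()
print(hits)
print(missing)
```

With a real file, the same function can be fed `gzip.open(path, "rt")`, where `path` points at whichever dbSNP VCF was downloaded. IDs that end up in `missing` are the ones to chase elsewhere.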
My last resort was to web-scrape the remaining SNPs from dbSNP, but my approach has become so convoluted that there surely must be an easier way to do this. Unfortunately, batch queries have been discontinued for dbSNP, and the new dbSNP API doesn't seem able to answer my question (or I don't understand it).
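As a possible alternative to scraping, here is a rough sketch of pulling GRCh38 placements out of a RefSNP JSON record from the NCBI Variation Services endpoint. The endpoint URL and the JSON field names (`primary_snapshot_data`, `placements_with_allele`, `is_ptlp`, `seq_id_traits_by_assembly`, `spdi`) are written from memory and should be treated as assumptions to check against a real response; the helper names and the mock record are invented for illustration.

```python
import json
import urllib.request
from typing import Any, Dict, List, Tuple

# Assumed endpoint; verify against the current NCBI documentation.
API_URL = "https://api.ncbi.nlm.nih.gov/variation/v0/refsnp/{}"


def fetch_refsnp(rsid: str) -> Dict[str, Any]:
    """Download one RefSNP record (network call, not run in this example)."""
    number = rsid[2:] if rsid.lower().startswith("rs") else rsid
    with urllib.request.urlopen(API_URL.format(number)) as response:
        return json.load(response)


def grch38_positions(record: Dict[str, Any]) -> List[Tuple[str, int, str, str]]:
    """Return (seq_id, 1-based position, ref, alt) for the GRCh38 placement."""
    hits = []
    snapshot = record.get("primary_snapshot_data", {})
    for placement in snapshot.get("placements_with_allele", []):
        if not placement.get("is_ptlp", False):  # keep the preferred placement
            continue
        traits = placement.get("placement_annot", {}).get(
            "seq_id_traits_by_assembly", []
        )
        if not any(t.get("assembly_name", "").startswith("GRCh38") for t in traits):
            continue
        for allele in placement.get("alleles", []):
            spdi = allele.get("allele", {}).get("spdi")
            if not spdi:
                continue
            ref, alt = spdi["deleted_sequence"], spdi["inserted_sequence"]
            if ref == alt:  # this entry describes the reference allele
                continue
            # SPDI positions are 0-based, so add 1 for VCF-style coordinates.
            hits.append((spdi["seq_id"], spdi["position"] + 1, ref, alt))
    return hits


def _allele(seq_id: str, position: int, ref: str, alt: str) -> Dict[str, Any]:
    """Build one mock allele entry (illustration only)."""
    return {"allele": {"spdi": {"seq_id": seq_id, "position": position,
                                "deleted_sequence": ref, "inserted_sequence": alt}}}


# Hand-built mock record (not real data) to show the expected shape.
mock_record = {"primary_snapshot_data": {"placements_with_allele": [
    {"is_ptlp": True,
     "placement_annot": {"seq_id_traits_by_assembly": [{"assembly_name": "GRCh38.p14"}]},
     "alleles": [_allele("NC_000001.11", 999, "A", "A"),
                 _allele("NC_000001.11", 999, "A", "G")]},
    {"is_ptlp": False,
     "placement_annot": {"seq_id_traits_by_assembly": [{"assembly_name": "GRCh37.p13"}]},
     "alleles": [_allele("NC_000001.10", 899, "A", "G")]},
]}}
print(grch38_positions(mock_record))
```

Two caveats: the sequence ID comes back as a RefSeq accession rather than a chromosome name, and a record for a merged or withdrawn ID may lack the primary section, in which case this returns an empty list. For thousands of IDs, requests would need to be spaced out to respect NCBI's rate limits.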
Does anyone have any tips or resources on what else I could try? Any help is greatly appreciated.
OP is asking to "liftover" data from GRCh37 to GRCh38. The link you posted is for mapping variants to GRCh38; there is no way to lift them over with it. Unless you clarify how this can be done, this answer will be moved to a comment.
Thank you for your answer. I suppose this does work for those with dbSNP IDs - though not for the ones with only hg19 positions given. If there is no approach that can do a liftover, I will definitely be trying this!
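For the rows that only have hg19 positions, coordinate liftover remains the fallback, and those rows first have to be written as BED intervals. A minimal sketch follows, assuming 1-based input positions (as in VCF or typical summary statistics) and UCSC-style "chr" chromosome names; the function name and the example variants are hypothetical.

```python
from typing import Iterable, Iterator, Tuple


def to_bed_lines(variants: Iterable[Tuple[str, int, str, str]]) -> Iterator[str]:
    """Yield BED lines (0-based, half-open) for (chrom, pos, ref, alt) tuples.

    Assumes pos is 1-based. Mitochondrial naming (MT vs chrM) is not handled.
    """
    for chrom, pos, ref, alt in variants:
        if not chrom.startswith("chr"):
            chrom = "chr" + chrom  # UCSC chain files use chr-prefixed names
        start = pos - 1            # BED start is 0-based
        end = start + len(ref)     # BED end is exclusive
        name = f"{chrom}:{pos}:{ref}:{alt}"  # keeps the hg19 key for re-joining
        yield f"{chrom}\t{start}\t{end}\t{name}"


# Invented example variants, not real loci.
example = [("1", 12345, "A", "G"), ("chrX", 500, "CT", "C")]
bed_lines = list(to_bed_lines(example))
for line in bed_lines:
    print(line)
```

The name column carries the original hg19 key, so the lifted output can be joined back to the summary statistics. Note that liftOver on BED input only converts coordinates; it does not check whether the ref / alt alleles or strand still match in hg38.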