I'm running imputation on some WGS VCF files for the first time, so I started with chr22 (GRCh37). I'm worried because the file started with 594,949 SNPs, and after the prep step (using the Will Rayner Perl script recommended here: https://imputationserver.readthedocs.io/en/latest/prepare-your-data/ ) only 243,108 remain as input for the imputation server. That means roughly 59% of the SNPs were dropped, which seems like a high attrition rate! I did some poking around, though I don't know Perl very well... the best I can figure is that most of the removed variants do not have an rsID.
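To check my guess about missing rsIDs, something like the sketch below could count how many records have "." in the ID column. This is only a rough sketch: the function names and the demo records are made up for illustration, and it assumes a standard tab-delimited VCF where the third column is the ID.

```python
import gzip
from typing import Iterable, Tuple


def count_missing_ids(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (total_records, records_without_id) for VCF text lines."""
    total = 0
    missing = 0
    for line in lines:
        # Skip blank lines and header lines (## meta lines and the #CHROM line).
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # malformed record, ignore it
        total += 1
        # In VCF, a missing ID is written as a single dot.
        if fields[2] == ".":
            missing += 1
    return total, missing


def open_vcf(path: str):
    """Open a plain or gzip/bgzip-compressed VCF as text (hypothetical helper)."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "r")


# Made-up demo records, not taken from the real file.
demo = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "22\t1000\trs_example\tA\tG\t.\tPASS\t.\n",
    "22\t2000\t.\tC\tT\t.\tPASS\t.\n",
    "22\t3000\t.\tG\tA\t.\tPASS\t.\n",
]

total, missing = count_missing_ids(demo)
print(f"{missing} of {total} records have no ID ({missing / total:.1%})")
```

For a real file, the same function could be called as `count_missing_ids(open_vcf(path))`, where `path` points at the chr22 VCF.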
My question: Is it OK to go into imputation with such a reduced set of SNPs? Should I try to annotate the variants with rsIDs first? Thanks!