18 months ago
selplat21
I have two files, each listing SNP IDs (SNP_ID) and their allele frequencies.
File 1 has a set of 200 SNPs (a subset from my genome) and their respective allele frequencies, whereas File 2 has every SNP in the genome along with its allele frequency.
I need to sample 1000 SNPs from File 2 so that their allele frequencies follow the distribution seen in File 1. In essence, I need random SNPs from the larger file (File 2) whose allele frequency distribution roughly matches that of File 1.
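To make the goal concrete, here is a rough Python sketch of one possible approach, frequency-bin matching: bin the File 1 frequencies, then draw from each File 2 bin in proportion to the File 1 counts. Everything specific here is an assumption rather than something I know to be right: the function names (`read_freqs`, `bin_index`, `sample_matched`) are made up, the files are assumed to be whitespace-delimited with SNP_ID in the first column and frequency in the second, 10 equal-width bins is an arbitrary choice, and leaving the File 1 SNPs out of the draw is my guess at what is wanted.

```python
import random
from collections import defaultdict


def read_freqs(path):
    """Read SNP_ID -> allele frequency (assumed two-column, whitespace-delimited)."""
    freqs = {}
    with open(path) as fh:
        for line in fh:
            fields = line.split()
            if len(fields) < 2:
                continue
            try:
                freqs[fields[0]] = float(fields[1])
            except ValueError:
                continue  # skip a header or malformed line
    return freqs


def bin_index(freq, n_bins):
    # Map a frequency in [0, 1] to an equal-width bin; 1.0 goes in the last bin.
    return min(int(freq * n_bins), n_bins - 1)


def sample_matched(target, genome, n_samples=1000, n_bins=10, seed=None):
    """Draw SNP IDs from `genome` whose binned frequencies mirror `target`.

    Both arguments are dicts mapping SNP_ID -> allele frequency.
    """
    rng = random.Random(seed)

    # Count how many target (File 1) SNPs fall in each frequency bin.
    counts = [0] * n_bins
    for freq in target.values():
        counts[bin_index(freq, n_bins)] += 1

    # Group candidate (File 2) SNPs by bin, skipping SNPs already in target
    # (an assumption; drop the check to allow them).
    pools = defaultdict(list)
    for snp, freq in genome.items():
        if snp not in target:
            pools[bin_index(freq, n_bins)].append(snp)

    # Split n_samples across bins in proportion to the target counts,
    # handing leftover draws to the bins with the largest remainders.
    total = sum(counts)
    exact = [n_samples * c / total for c in counts]
    quota = [int(x) for x in exact]
    leftover = n_samples - sum(quota)
    by_remainder = sorted(range(n_bins), key=lambda b: exact[b] - quota[b], reverse=True)
    for b in by_remainder[:leftover]:
        quota[b] += 1

    # Sample without replacement inside each bin; if a bin has too few
    # candidates, take what is there (the result is then shorter than n_samples).
    chosen = []
    for b in range(n_bins):
        k = min(quota[b], len(pools[b]))
        chosen.extend(rng.sample(pools[b], k))
    return chosen


# Example usage (file names are placeholders):
# target = read_freqs("file1.txt")
# genome = read_freqs("file2.txt")
# picked = sample_matched(target, genome, n_samples=1000, n_bins=10, seed=42)
```

With only 200 SNPs in File 1, narrower bins would match the distribution more closely but leave some bins empty or sparse, so the bin count is a trade-off I am unsure about.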
Any help is appreciated!