Hello,
I used the Michigan Imputation Server to impute my dataset. Unfortunately, the output contains many duplicate variants: the same position and the same alleles appear in two separate rows that differ only in allele frequency, R2 value, etc.
I converted the results to ped/map files, which still contain these exact duplicates, down to the allele. For example,
22:17996285:A:ATCTC
22:17996285:A:ATCTC
I need to keep only one variant from each duplicate pair in order to move on with my analysis. However, the --rm-dup option in PLINK 2.0 does not have a mode that lets me choose which one to keep (the force-first mode does not work for me, because which duplicate I want to keep depends on the minor allele frequency, so it is not always the first one).
I do know the line numbers of these rows in the map file. Is there a way in PLINK to delete a variant by line number? In the example above, I would want to delete the first of the two rows, which is line 11092 in my map file.
If not, is there a way to do this manually?
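For reference, a manual approach could look like the following Python sketch. It is only an illustration under stated assumptions, not a tested pipeline: it assumes the standard text layout in which the map file has one variant per line and each ped row has 6 leading columns followed by 2 allele columns per variant, in the same order as the map file. The function name drop_variants and its arguments are hypothetical.

```python
def drop_variants(map_lines, ped_lines, drop_line_numbers):
    """Remove variants, given by 1-based .map line number, from map and ped data.

    Assumes each ped row has 6 leading columns (family ID, individual ID,
    father, mother, sex, phenotype) followed by 2 allele columns per variant.
    """
    n_variants = len(map_lines)
    for n in drop_line_numbers:
        if not 1 <= n <= n_variants:
            raise ValueError(f"line number {n} is outside 1..{n_variants}")

    # Convert 1-based line numbers to 0-based variant indices.
    drop = {n - 1 for n in drop_line_numbers}
    keep = [i for i in range(n_variants) if i not in drop]

    kept_map = [map_lines[i] for i in keep]

    kept_ped = []
    for row in ped_lines:
        fields = row.split()
        if len(fields) != 6 + 2 * n_variants:
            raise ValueError("ped row does not match the number of map lines")
        out = fields[:6]
        for i in keep:
            # Variant i occupies columns 6 + 2i and 7 + 2i (0-based).
            out.extend(fields[6 + 2 * i : 8 + 2 * i])
        kept_ped.append(" ".join(out))

    return kept_map, kept_ped
```

The inputs can be read with open(path).read().splitlines(), and the returned lists written back out with one entry per line. Since the same variant has to be removed from both files at once, deleting the line from the map file alone would leave the ped file out of sync.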
Thank you