Merging two Plink datasets, but some sites have reversed genotypes
9.4 years ago
devenvyas ▴ 760

I have two datasets from the same type of array in Plink map/ped format. The smaller of the two has about 586,662 SNPs and 93 samples, and the larger has about 620,000 SNPs and 934 samples.

I want to merge the two datasets such that I have an intersection of the two (i.e., all 1027 samples but only SNPs present in both data sets).

From an experience a few days ago, I know that the larger dataset has about 7000 sites where the alleles are reported on the opposite strand from the smaller set (e.g., the larger set has a C/A polymorphism where the smaller set has G/T). Happily, this array was designed to exclude symmetrical SNPs (A/T or C/G), so fixing this problem is a little less confusing; however, I do not have a list of the flipped SNPs.

I know I can flip genotypes in Plink using

plink --file data --flip list.txt --flip-subset mylist.txt --recode

How can I identify these sites and merge the datasets so that every site is reported on the same strand? Thanks

plink SNP • 4.8k views
9.4 years ago

Convert both filesets to binary, then try to merge them with --bmerge. The merge will fail on SNPs whose alleles disagree between the two filesets, and the resulting .missnp file should list all the loci that need to be flipped.
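In case it helps, here is a rough sketch of those steps as a shell function. It is not a tested pipeline: it assumes PLINK 1.x is on the PATH, and the function name merge_after_flip and the output names (trial, merged, and the _bin / _flipped suffixes) are placeholders I made up. As far as I know the list is written as trial-merge.missnp by PLINK 1.9 and as trial.missnp by 1.07, so the sketch simply looks for both.

```shell
# Sketch only. Assumes PLINK 1.x is on the PATH; the fileset prefixes passed
# in and the output names (trial, merged) are placeholders.
merge_after_flip() {
    small=$1
    large=$2
    missnp=""

    # 1. Convert both text filesets (.ped/.map) to binary (.bed/.bim/.fam).
    plink --file "$small" --make-bed --out "${small}_bin"
    plink --file "$large" --make-bed --out "${large}_bin"

    # 2. Trial merge. If strands disagree this is expected to fail and to
    #    leave behind a list of the offending SNPs, hence the "|| true".
    plink --bfile "${large}_bin" \
        --bmerge "${small}_bin.bed" "${small}_bin.bim" "${small}_bin.fam" \
        --make-bed --out trial || true

    # The list name depends on the PLINK version (trial-merge.missnp or
    # trial.missnp), so look for both.
    for f in trial-merge.missnp trial.missnp; do
        if [ -f "$f" ]; then missnp=$f; fi
    done
    if [ -z "$missnp" ]; then
        echo "no .missnp file found; the trial merge may already be usable" >&2
        return 0
    fi

    # 3. Flip the listed SNPs in one dataset only, then merge again.
    plink --bfile "${small}_bin" --flip "$missnp" \
        --make-bed --out "${small}_flipped"
    plink --bfile "${large}_bin" \
        --bmerge "${small}_flipped.bed" "${small}_flipped.bim" "${small}_flipped.fam" \
        --make-bed --out merged
}
```

You would call it as `merge_after_flip small large`, where small and large are the map/ped prefixes. If the second merge still complains about some SNPs, those are probably not simple strand flips, and dropping them with --exclude is likely the safer option.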


To be clear, I would take that missnp file and flip one of the datasets and then re-merge, and then I would be good? (Also, the samples in each dataset are completely different)

Also, it appears that merge mode (http://pngu.mgh.harvard.edu/~purcell/plink/dataman.shtml#merge) produces a union instead of an intersection, but I don't want the SNPs that are present in only one of the two datasets. I only want SNPs present in both.


Yes, that's correct, you flip just one dataset and remerge.

You will only get merge conflicts for SNPs present in both datasets.
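On the intersection question, one possible approach (a sketch, not something spelled out above) is to list the SNP IDs that occur in both .bim files and then keep only those with --extract. The helper name common_snps and the file names in the comments are placeholders, and the sketch assumes both datasets use the same SNP IDs in column 2 of the .bim file.

```shell
# common_snps BIM_A BIM_B
# Print the SNP IDs (column 2 of a .bim file) that occur in both files.
common_snps() {
    awk 'NR == FNR { seen[$2] = 1; next }   # first file: remember each ID
         ($2 in seen) { print $2 }          # second file: print shared IDs
        ' "$1" "$2"
}

# Hypothetical usage, with placeholder fileset names:
#   common_snps small_bin.bim large_bin.bim > common_snps.txt
#   plink --bfile merged --extract common_snps.txt --make-bed --out merged_common
```

Filtering after the merge keeps the merge step itself unchanged; applying --extract to each dataset before merging should give the same set of SNPs.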
