Merging inconsistencies in PLINK
7.2 years ago

I am trying to merge two PLINK filesets that were genotyped on the same platform. However, I found some inconsistencies that could not be resolved with the --flip option. I checked the SNPs that still failed to merge after flipping and found this type of problem:

File 1:

4    rs10000432    68.93    47511781    T    C

File 2:

4    rs10000432    68.93    47511781    A    C

One of the alleles differs, while the other allele is the same in both files.

What should I do in this case? I wasn't expecting to find this type of issue in these datasets, since both were genotyped on the same platform.


Just an initial comment: for that particular SNP, the ancestral 'reference' allele is C (https://www.ncbi.nlm.nih.gov/projects/SNP/snp_ref.cgi?rs=10000432).

When you attempted to merge the first time, PLINK may have written a file with the extension *.missnp, which lists the variants whose alleles could not be reconciled between the two datasets (they look like multi-allelic sites, as in your example). You can remove these from both datasets with the following commands:

plink --noweb --bfile MyData1 --exclude MyData.missnp --make-bed --out MyData1.Pruned ;
plink --noweb --bfile MyData2 --exclude MyData.missnp --make-bed --out MyData2.Pruned ;

Then attempt to merge again.
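
For completeness, here is a rough sketch of what the second merge attempt could look like. The helper name merge_pruned and the output prefix MyDataMerged are just placeholders I made up; the two input prefixes are the outputs of the commands above. The PLINK variable is only there so the binary name or path can be overridden.

```shell
# Sketch only: merge_pruned and MyDataMerged are hypothetical names.
# Set PLINK to the binary name or path used on your system.
PLINK="${PLINK:-plink}"

# $1 = first pruned fileset prefix
# $2 = second pruned fileset prefix
# $3 = output prefix for the merged fileset
merge_pruned() {
    "$PLINK" --noweb --bfile "$1" \
        --bmerge "$2.bed" "$2.bim" "$2.fam" \
        --make-bed --out "$3"
}

# Usage:
#   merge_pruned MyData1.Pruned MyData2.Pruned MyDataMerged
```

If the merge still fails, PLINK should normally write a fresh .missnp file under the new --out prefix, and the exclusion step can be repeated with that list.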

If you would rather not remove these, then you may have to do more rigorous data preparation. In which format was your data initially: VCF? It would be useful to check every variant against a reference genome and ensure that the ref>alt order is maintained.
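
As a first step in that kind of data check, it can help to see exactly how the alleles disagree between the two .bim files before deciding what to drop. Below is a minimal sketch using awk. The function name classify_bim is my own invention, and it assumes the usual six-column .bim layout (chromosome, SNP ID, cM, bp, allele 1, allele 2) with single-base alleles. For every SNP ID present in both files it reports "match", "flip" (the allele sets agree after complementing one side), or "mismatch" (they disagree even after flipping, as with rs10000432 above).

```shell
# Sketch only: classify_bim is a hypothetical helper, not part of PLINK.
# $1 = first .bim file, $2 = second .bim file
classify_bim() {
    awk '
        # Complement of a single-base allele; anything else (0, indels) is returned unchanged.
        function comp(x) {
            if (x == "A") return "T"
            if (x == "T") return "A"
            if (x == "C") return "G"
            if (x == "G") return "C"
            return x
        }
        # True when alleles {p, q} and {r, s} are the same unordered pair.
        function same_set(p, q, r, s) {
            return (p == r && q == s) || (p == s && q == r)
        }
        # First file: remember both alleles per SNP ID.
        NR == FNR { first1[$2] = $5; first2[$2] = $6; next }
        # Second file: classify every SNP ID also seen in the first file.
        ($2 in first1) {
            if (same_set(first1[$2], first2[$2], $5, $6))
                status = "match"
            else if (same_set(comp(first1[$2]), comp(first2[$2]), $5, $6))
                status = "flip"
            else
                status = "mismatch"
            print $2, first1[$2] "/" first2[$2], $5 "/" $6, status
        }
    ' "$1" "$2"
}

# Usage:
#   classify_bim MyData1.bim MyData2.bim | grep mismatch
```

Note that A/T and C/G SNPs are strand-ambiguous, so a "match" for those says nothing about whether the two datasets are actually on the same strand.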
