Problem merging data in plink
3.3 years ago
Max ▴ 10

Hello everybody,

In short, this is the error message that appears:

Error: 4 variants with 3+ alleles present.

  • If you believe this is due to strand inconsistency, try --flip with merge-merge.missnp. (Warning: if this seems to work, strand errors involving SNPs with A/T or C/G alleles probably remain in your data. If LD between nearby SNPs is high, --flip-scan should detect them.)
  • If you are dealing with genuine multiallelic variants, we recommend exporting that subset of the data to VCF (via e.g. '--recode vcf'), merging with another tool/script, and then importing the result; PLINK is not yet suited to handling them. See https://www.cog-genomics.org/plink/1.9/data#merge3 for more discussion.
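To decide between the two cases in the error message, it can help to look at the alleles each dataset reports for the listed variants. The following is only a sketch: the function name `show_missnp_alleles` is made up for illustration, and it assumes the standard six-column .bim layout with one variant ID per line in the .missnp file.

```shell
# Hypothetical helper (not part of PLINK): print the alleles each dataset
# reports for every variant ID listed in a .missnp file.
# Usage: show_missnp_alleles merge-merge.missnp data1.bim data2.bim
show_missnp_alleles() {
    missnp=$1
    shift
    for bim in "$@"; do
        # .bim columns: chromosome, variant ID, genetic distance, bp, allele 1, allele 2
        awk -v src="$bim" '
            NR == FNR { want[$1] = 1; next }        # first file: IDs to look up
            ($2 in want) { print $2, src, $5, $6 }  # second file: matching .bim rows
        ' "$missnp" "$bim"
    done | sort
}
```

If a variant shows A/G in one file and T/C in the other, the allele pairs are complements, which points to a strand flip. If it shows something like A/G in one file and A/C in the other, the variant has three distinct alleles, and flipping will not resolve it.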

For the record, I also tried merging these data with EIGENSOFT, but that merge failed as well, with this message:

*** warning. genetic distances are in cM not Morgans

1 rs1909067 100.026 73942599 T C
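The message above is a warning, so it may not be what stopped the EIGENSOFT run. It does indicate that the genetic-distance column (100.026 in the printed line) looks like centimorgans. If Morgans are wanted, one possible sketch is to divide the third .bim column by 100. The function name `bim_cm_to_morgans` is made up for illustration, and the assumption that the whole column is in cM should be checked against the data first.

```shell
# Hypothetical helper: rewrite a .bim file with column 3 converted from
# centimorgans to Morgans (divide by 100). Assumes every row is in cM.
# Usage: bim_cm_to_morgans data1.bim > data1_morgans.bim
bim_cm_to_morgans() {
    awk 'BEGIN { OFS = "\t" }
         { $3 = sprintf("%.10g", $3 / 100); print }' "$1"
}
```

The output is written to a new file so the original .bim stays untouched.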

Finally, this is the command I used in PLINK:

./plink --bfile data1 --bmerge data2 --make-bed --out merge

I hope to find a solution to my problem. Thanks in advance.

Plink • 3.7k views
3.3 years ago

Follow the link in the plink error message.


I did that and nothing changed!


This response doesn't make sense. Read the linked section more carefully.


Believe me, I have tried everything, and of course the first thing I did was try the solutions in the link from the error message!

I also tried the solutions provided in this link as well:

Plink-handling multiallelic variant to merge two datasets

Anyway, thank you.


Then the obvious thing to do is to actually post what happened when you tried, in enough detail to allow someone else to replicate it. You didn’t do that.


I used these commands (from the link in the error message), but nothing changed and I got the same error message:

./plink --bfile data2 --flip merge-merge.missnp --make-bed --out source2_trial

./plink --bfile data1 --bmerge source2_trial --make-bed --out Combined_data
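One way to judge whether the trial flip helped is to compare the list of failing variants before and after it. This is a sketch only: the function name `flip_summary` is made up for illustration, and it assumes the second merge wrote its own list as `Combined_data-merge.missnp` (following the `<out prefix>-merge.missnp` pattern named in the error message).

```shell
# Hypothetical helper: compare two .missnp lists (one variant ID per line).
# Usage: flip_summary merge-merge.missnp Combined_data-merge.missnp
flip_summary() {
    before_file=$1
    after_file=$2
    # Count non-empty lines in each list.
    before=$(awk 'NF { n++ } END { print n + 0 }' "$before_file")
    after=$(awk 'NF { n++ } END { print n + 0 }' "$after_file")
    # IDs present in both lists were not fixed by flipping.
    unchanged=$(awk '
        NR == FNR { if (NF) seen[$1] = 1; next }
        NF && ($1 in seen) { n++ }
        END { print n + 0 }
    ' "$before_file" "$after_file")
    echo "before=$before after=$after unchanged=$unchanged"
}
```

If most of the variants are still listed after the flip, strand inconsistency is probably not the cause.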

Anyway, I found the solution here (in another question)! Merging succeeded with fcgene-1.0.7, which set the variables automatically, without any manual intervention on my part.

Once again, thank you very much for the reply.

I hope others benefit from what was mentioned here; while searching for a solution, I noticed that many people face the same problem.


As expected, you did not actually read the documentation carefully. "If, on the other hand, your 'trial flip' results suggest that strand errors are not an issue (i.e. most merge errors remained), and you don't have much time for further inspection, you can use the following sequence of commands to remove all offending variants and remerge..."


Yes, I admit I was hasty and did not try all the suggestions presented here:

https://www.cog-genomics.org/plink/1.9/data#merge3

Anyway, I took your advice and used these commands, and the merge worked with PLINK too:

./plink --bfile source1 --exclude merged.missnp --make-bed --out source1_tmp

./plink --bfile source2 --exclude merged.missnp --make-bed --out source2_tmp

./plink --bfile source1_tmp --bmerge source2_tmp --make-bed --out merged11

rm source1_tmp.*

rm source2_tmp.*

Finally, I received this message. I checked the data and found no errors:

512954 variants loaded from .bim file.
1073 people (557 males, 479 females, 37 ambiguous) loaded from .fam.
Ambiguous sex IDs written to merged111.nosex .
127 phenotype values loaded from .fam.
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 1073 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is 0.533688.
512954 variants and 1073 people pass filters and QC.
Among remaining phenotypes, 0 are cases and 127 are controls. (946 phenotypes are missing.)
--make-bed to merged111.bed + merged111.bim + merged111.fam ... done.
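The log reports a total genotyping rate of 0.533688. One possible explanation, stated here as an assumption rather than a diagnosis, is that many variants exist in only one of the two inputs, so they are missing for every sample from the other. A rough way to check is to count the variant IDs the two .bim files share. The function name `bim_overlap` is made up for illustration.

```shell
# Hypothetical helper: count variant IDs (column 2) in two .bim files
# and how many IDs appear in both.
# Usage: bim_overlap source1_tmp.bim source2_tmp.bim
bim_overlap() {
    awk '
        NR == FNR { first[$2] = 1; n1++; next }     # first file: remember IDs
        { n2++; if ($2 in first) shared++ }         # second file: count matches
        END { printf "first=%d second=%d shared=%d\n", n1, n2, shared }
    ' "$1" "$2"
}
```

If `shared` is small relative to the two totals, restricting both datasets to the shared variants before merging may be worth considering.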
