I did the inputation for missing SNPs with MACH, using the reference panel "hm3_r2_b36_fwd.CEU" or "CEU.r22.orig".
I got a FATAL ERROR - Please ensure that allele labels in pedigree are consistent with haplotype file...
I am confused about whether the SNP IDs (rsXXXX) in my raw data and in the panel need to be in one-to-one correspondence. That is to say, it seems they are not matched by label automatically.
What should I do in this case?
(1) Match the SNP IDs (rsXXXX) manually?
OR
(2) Simply run it without a panel:
mach1 -d sample.dat -p sample.ped --rounds 50 --states 200 --phase --compact
Please give some details about the results of the two methods above. Thanks.
Before genotype imputation, you should carry out basic data quality
checks on available genotypes. Typically, we exclude from analysis
markers that have low genotyping success rates (perhaps with <95% of
genotypes called successfully), unexpected evidence for deviations
from Hardy-Weinberg equilibrium (perhaps with an HWE p-value <
0.000001 or so), large numbers of discrepancies among duplicate samples or with several Mendelian inconsistencies in available
parent-offspring trios, or that are rare (with MAF < 1% or so). All
these checks are platform and study specific, and you'll have to
figure out what is appropriate for your data. They are mentioned here
as a reminder...
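As a rough illustration of the checks above, here is a minimal Python sketch that flags markers using the example thresholds from the text (call rate below 95%, HWE p-value below 0.000001, MAF below 1%). The `MarkerStats` record and the `qc_failures` helper are hypothetical names made up for this example; they are not part of MACH, and the thresholds should be adjusted for your platform and study.

```python
from dataclasses import dataclass


@dataclass
class MarkerStats:
    """Per-marker summary statistics (hypothetical structure for this sketch)."""
    name: str
    call_rate: float  # fraction of genotypes called successfully
    hwe_p: float      # p-value of the Hardy-Weinberg equilibrium test
    maf: float        # minor allele frequency


def qc_failures(marker, min_call_rate=0.95, min_hwe_p=1e-6, min_maf=0.01):
    """Return the list of reasons a marker fails QC (empty list = passes)."""
    reasons = []
    if marker.call_rate < min_call_rate:
        reasons.append("low call rate")
    if marker.hwe_p < min_hwe_p:
        reasons.append("HWE deviation")
    if marker.maf < min_maf:
        reasons.append("rare")
    return reasons


# Made-up example markers, purely for illustration.
markers = [
    MarkerStats("rsA", call_rate=0.99, hwe_p=0.40, maf=0.25),
    MarkerStats("rsB", call_rate=0.90, hwe_p=0.40, maf=0.25),
    MarkerStats("rsC", call_rate=0.99, hwe_p=1e-9, maf=0.005),
]
kept = [m.name for m in markers if not qc_failures(m)]
```

Checks on duplicate samples and parent-offspring trios need genotype-level data and are left out of this sketch.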
When MACH loads your pedigree and the reference haplotypes, it checks
that allele labels in the two samples are compatible and that allele
frequencies are broadly comparable. If your sample includes no A/T or
G/C SNPs (e.g. because it was genotyped on an Illumina Infinium
platform), you can use the --autoFlip option to ensure that alleles in
the pedigree file and those in the reference haplotypes refer to the
same strand. If your sample does include A/T and G/C SNPs, you'll
have to ensure they are aligned to the same strand manually and
inspect allele frequency discrepancies identified by MACH to help
pinpoint problems. Although it is typical that a small number of SNPs
will drift in frequency between populations, we recommend that you
read through the warnings generated by MACH. If you see large
frequency discrepancies or anything else suspicious ... investigate!
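To make the manual check more concrete, the following Python sketch picks out A/T and G/C SNPs and applies a simple frequency heuristic to them. This is only one possible approach, not what MACH does internally: the function names and the `min_gap` threshold of 0.2 are assumptions chosen for illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_strand_ambiguous(allele1, allele2):
    """True for A/T and C/G SNPs, whose two alleles are complements."""
    return COMPLEMENT[allele1.upper()] == allele2.upper()


def suspect_strand_flip(freq_sample, freq_reference, min_gap=0.2):
    """Heuristic for an ambiguous SNP.

    freq_sample and freq_reference are the frequencies of the allele with
    the same label (e.g. "A") in the sample and in the reference panel.
    A flip is suspected when the two differ by more than min_gap and the
    sample frequency is closer to the frequency of the other reference allele.
    """
    gap_same = abs(freq_sample - freq_reference)
    gap_swapped = abs(freq_sample - (1 - freq_reference))
    return gap_same > min_gap and gap_swapped < gap_same
```

For example, `suspect_strand_flip(0.8, 0.2)` is true, while `suspect_strand_flip(0.22, 0.2)` is not. SNPs with a frequency near 0.5 cannot be resolved this way and need strand information from the array annotation.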
Newer versions of MACH will automatically ignore any SNPs that are
present in your pedigree file but not in the reference panel. SNPs
that are present only in the reference panel but not in your pedigree
will be imputed!
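In other words, the SNP lists do not need to be in one-to-one correspondence. A small Python sketch of how the two lists relate is shown below; `partition_snps` is a hypothetical helper for inspecting your own marker lists before running MACH, not a MACH function.

```python
def partition_snps(pedigree_snps, reference_snps):
    """Split SNP names by where they occur.

    shared  : present in both, used to match haplotypes
    ignored : only in the pedigree file (skipped by newer MACH versions)
    imputed : only in the reference panel (these get imputed)
    """
    ped, ref = set(pedigree_snps), set(reference_snps)
    return {
        "shared": sorted(ped & ref),
        "ignored": sorted(ped - ref),
        "imputed": sorted(ref - ped),
    }


# Made-up SNP names, purely for illustration.
result = partition_snps(["rs1", "rs2", "rs9"], ["rs1", "rs2", "rs3", "rs4"])
```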
Using the --autoFlip option might resolve your issue.
This option flips alleles in the pedigree file according to base pairing
(A<->T and C<->G) if more than 2 alleles are found when the pedigree and
reference are put together. Note that it will not affect A/T or C/G SNPs, as
a strand mismatch there won't lead to more than two alleles.
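The flipping rule described here can be sketched in Python as follows. This illustrates the logic only, under the assumption that alleles are given as single-letter labels; it is not MACH's actual implementation, and `auto_flip` is a made-up name.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def auto_flip(pedigree_alleles, reference_alleles):
    """Return pedigree allele labels aligned to the reference strand.

    Labels are flipped (A<->T, C<->G) only when combining pedigree and
    reference gives more than two distinct alleles. A/T and C/G SNPs never
    trigger this, so a strand mismatch there goes unnoticed.
    """
    ped, ref = set(pedigree_alleles), set(reference_alleles)
    if len(ped | ref) <= 2:
        return ped  # already compatible, or ambiguous and undetectable
    flipped = {COMPLEMENT[a] for a in ped}
    if len(flipped | ref) <= 2:
        return flipped
    raise ValueError("allele labels incompatible even after flipping")
```

For instance, a pedigree SNP coded A/G against a reference coded T/C gives four alleles in total, so the pedigree labels are flipped to T/C. A pedigree SNP coded A/T against a reference coded T/A gives only two, so nothing is changed even if the strands differ.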
I was wondering whether you managed to solve the above problem. I am encountering similar problems while imputing some untyped SNPs. Could you please let me know how you solved the error?
Many thanks, and I look forward to hearing back from you.
BTW, it is iMputation not iNputation.