Hi All,
The Impute2 manual recommends the standard Impute2 MCMC algorithm for fine-scale imputation of small genome regions, but doesn't explain how to run it. Any help would be appreciated.
I'm using Impute2 to impute the whole genomes of 1000 individuals in 5Mb blocks. WITH pre-phasing I am getting ~90% concordance for chromosomes 1-13 and ~85% concordance for chromosomes 14-22. This is before filtering for low MAF etc. I then re-ran the imputation on two 5Mb blocks without pre-phasing to try to improve the accuracy in these target regions. For one block I only see a 0.2% improvement in accuracy, and for the second block the pre-phased imputation is actually more accurate (by 0.2%).
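A minimal sketch of how the 5Mb intervals passed to -int could be generated. The function name make_chunks and the chromosome length in the usage line are illustrative assumptions, not part of the pipeline described here.

```shell
# Hypothetical helper: print "start stop" pairs covering a chromosome
# in fixed-size windows (default 5 Mb). The last window is truncated
# at the chromosome end. Runs in a subshell so variables stay local.
make_chunks() (
    chr_len=$1             # chromosome length in bp (assumed input)
    size=${2:-5000000}     # window size in bp
    start=1
    while [ "$start" -le "$chr_len" ]; do
        stop=$((start + size - 1))
        if [ "$stop" -gt "$chr_len" ]; then
            stop=$chr_len
        fi
        echo "$start $stop"
        start=$((stop + 1))
    done
)

# Example with a made-up 12 Mb chromosome length
make_chunks 12000000
```

Each printed pair could then be read into $start and $stop for a single Impute2 run.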
My function for this fine-scale impute is as follows...
eval $IMPUTE \
-g chr"$chr".gen \
-m $REF/genetic_map_chr"$chr"_combined_b37.txt \
-h 1000GP_Phase3_chr"$chr".hap \
-l $REF/1000GP_Phase3_chr"$chr".legend \
-int $start $stop \
-buffer 1000 \
-align_by_maf_g \
-Ne 20000 \
-k 100 \
-o $OUTDIR/chr"$chr".$start.$stop.one.phased.impute2 \
-phase
Basically, I have just run Impute2 without the -prephase_g, -known_haps_g and -use_prephase_g flags. I have also increased -k from 80 to 100 and increased the buffer region from 250kb to 1Mb. Is this correct? Has anyone else tried this method?
Thanks in advance,
Lesley