I'm trying to reproduce the results (Table 1) described in Shringarpure et al. - 2016 - Efficient analysis of large datasets and sex bias
The results involve calculating global admixture proportions on the X chromosomes of both male and female samples, using the newly introduced "haploid" mode. I downloaded the chrX genotypes in PLINK1 format from here. I filtered the genotypes as described in the paper, by removing SNPs with MAF < 0.05 and LD-thinning pairs of SNPs with r^2 > 0.1. After that, I ran ADMIXTURE in haploid mode using the option --haploid="male:23". However, the run had not converged after >100 iterations.
This is the script I used to run ADMIXTURE:
plink --bfile 1kg_phase1_chrX \
--memory 2000 \
--maf 0.05 \
--make-bed \
--out chrX.maf_0.05
plink --bfile chrX.maf_0.05 \
--indep-pairwise 50 5 0.1
plink --bfile chrX.maf_0.05 \
--extract plink.prune.in \
--make-bed \
--out chrX.filtered
admixture --haploid="male:23" -s 1 chrX.filtered.bed 2 -j 12
Looking at the log files, it seems that ADMIXTURE had a hard time finding a local maximum:
276 (QN/Block) Elapsed: 1.811 Loglikelihood: -8.52897e+06 (delta): 1.42774
277 (QN/Block) Elapsed: 1.602 Loglikelihood: -8.52881e+06 (delta): 156.512
278 (QN/Block) Elapsed: 1.685 Loglikelihood: -8.52891e+06 (delta): -99.4092
279 (QN/Block) Elapsed: 1.618 Loglikelihood: -8.52884e+06 (delta): 68.4685
280 (QN/Block) Elapsed: 1.653 Loglikelihood: -8.52883e+06 (delta): 12.983
281 (QN/Block) Elapsed: 1.697 Loglikelihood: -8.52894e+06 (delta): -110.379
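To put a number on the oscillation, here is a minimal sketch that counts how many iterations in a log have a negative delta, meaning the log-likelihood went down instead of up. It assumes the line format shown above, with the delta as the last field on each line. The function name summarise_deltas is hypothetical and not part of ADMIXTURE.

```shell
# Hypothetical helper: summarise per-iteration log-likelihood changes
# from an ADMIXTURE log read on stdin. Assumes each iteration line
# contains "Loglikelihood:" and "(delta):", with delta as the last field.
summarise_deltas() {
  awk '/Loglikelihood:/ && /\(delta\):/ {
         n++                 # count iteration lines
         d = $NF + 0         # last field is the delta, coerced to a number
         if (d < 0) neg++    # a negative delta means the likelihood decreased
       }
       END { printf "iterations=%d negative_deltas=%d\n", n, neg + 0 }'
}
```

Piping the six log lines above into summarise_deltas prints "iterations=6 negative_deltas=2". A run that is climbing steadily toward a maximum would be expected to show few or no negative deltas.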
What did I do wrong?