I'd like to generate permutation data from the genotypes of 200 samples using PLINK.
To create this file, I permuted the case/control labels in a GWAS experiment and recalculated the p-values of all SNPs under each permutation. I ran 50 permutations.
./plink --file 1000G_EUR --assoc --perm --mperm 50 --mperm-save-all --seed 6377474
The input consists of a MAP file and a PED file. The first 10 rows and first 10 samples of the MAP file look like this:
CHROM POS HG00098 HG00100 HG00106 HG00112 HG00114 HG00116 HG00117 HG00118 HG00119 HG00120 HG00122
1 10469 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0
1 10583 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|1 0|0
1 11508 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1
1 11565 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0
1 12783 1|1 1|1 1|1 1|1 1|0 1|1 1|1 1|1 1|1 1|1 1|1
1 13116 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0
1 13327 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0
1 13980 ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./.
1 14699 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0
The first 10 rows of the PED file look like this (the first samples are all controls, so the 6th column is 1 throughout; I have ~200 samples, samples 1-100 are coded 1 and samples 101-200 are coded 2):
HG00098 HG00098 0 0 0 1 C C G G
HG00100 HG00100 0 0 0 1 C C G G
HG00106 HG00106 0 0 0 1 C C G G
HG00112 HG00112 0 0 0 1 C C G G
HG00114 HG00114 0 0 0 1 C C G G
HG00116 HG00116 0 0 0 1 C C G G
HG00117 HG00117 0 0 0 1 C C G G
HG00118 HG00118 0 0 0 1 C C G G
HG00119 HG00119 0 0 0 1 C C G G
HG00120 HG00120 0 0 0 1 C C G A
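A quick sanity check on a PED file like the one above can be sketched in Python. This is only an illustration, assuming the standard PED layout (six leading columns with the phenotype in column 6, then two allele columns per SNP); the function name `summarize_ped` and its `n_snps` argument are made up for this example and are not part of PLINK.

```python
from collections import Counter


def summarize_ped(lines, n_snps=None):
    """Count phenotype codes (column 6) and flag rows with odd genotype widths.

    Assumes the standard PED layout: FID IID PID MID SEX PHENO, then two
    allele columns per SNP. n_snps is optional; when given, every row must
    carry exactly 2 * n_snps allele columns.
    """
    phenotypes = Counter()
    bad_rows = []
    for row_number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) < 6:
            bad_rows.append(row_number)
            continue
        phenotypes[fields[5]] += 1
        n_alleles = len(fields) - 6
        wrong_width = n_snps is not None and n_alleles != 2 * n_snps
        if n_alleles % 2 != 0 or wrong_width:
            bad_rows.append(row_number)
    return phenotypes, bad_rows


# Two rows copied from the PED excerpt above (2 SNPs = 4 allele columns).
example = [
    "HG00098 HG00098 0 0 0 1 C C G G",
    "HG00120 HG00120 0 0 0 1 C C G A",
]
phenotypes, bad_rows = summarize_ped(example, n_snps=2)
print(dict(phenotypes), bad_rows)
```

Run on the full file (for example with `open("1000G_EUR.ped")` as `lines`), the phenotype counts should show both 1 and 2, and `n_snps` should equal the number of lines in the MAP file.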
The ideal permutation data I'd like to have should look like the following: each row starts with a SNP (rs#), followed by the p-values of this SNP under multiple permutations.
rs422111 0.2025 0.1719 0.8134 0.7883 0.3906 0.849 0.3181 0.1951 0.911 0.6676
rs406028 0.2722 0.1246 0.6835 0.7852 0.4953 0.7191 0.3153 0.1993 0.7748 0.6562
rs391674 0.2355 0.1468 0.7475 0.722 0.4412 0.7832 0.2772 0.2286 0.8423 0.601
rs867389 0.8557 0.003566 0.4266 0.4393 0.3151 0.6239 0.1324 0.8949 0.6167 0.9822
At this point, unfortunately, my results all look like trash..!#%@#%^#@$%! I took the first few rows as an example:
(A) plink.assoc file
CHR SNP BP A1 F_A F_U A2 CHISQ P OR
1 rs117577454 10469 G NA NA C NA NA NA
1 rs58108140 10583 A NA NA G NA NA NA
(B) plink.assoc.mperm
CHR SNP EMP1 EMP2
1 rs117577454 1 1
1 rs58108140 1 1
(C) plink.mperm.dump.all
0 NA NA NA NA NA NA NA NA NA
1 NA NA NA NA NA NA NA NA NA
2 NA NA NA NA NA NA NA NA NA
3 NA NA NA NA NA NA NA NA NA
4 NA NA NA NA NA NA NA NA NA
5 NA NA NA NA NA NA NA NA NA
6 NA NA NA NA NA NA NA NA NA
7 NA NA NA NA NA NA NA NA NA
8 NA NA NA NA NA NA NA NA NA
9 NA NA NA NA NA NA NA NA NA
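That is an invalid MAP file. A PLINK MAP file has no header line and exactly four columns per variant (chromosome, variant ID, genetic distance, base-pair position); what you posted looks like a genotype table in VCF-style notation.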
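To make the format problem concrete, here is a minimal Python sketch that checks lines against the four-column MAP layout (chromosome, variant ID, genetic distance, base-pair position, no header). The helper name `check_map_lines` is hypothetical, and the "valid" lines are built from the SNP IDs and positions in the plink.assoc output above with an assumed genetic distance of 0.

```python
def check_map_lines(lines):
    """Return (line_number, message) pairs for lines that break the MAP layout."""
    problems = []
    for line_number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        if len(fields) != 4:
            problems.append(
                (line_number, f"expected 4 columns, found {len(fields)}")
            )
            continue
        _chrom, _snp_id, genetic_distance, bp_position = fields
        try:
            float(genetic_distance)
            int(bp_position)
        except ValueError:
            problems.append(
                (line_number, "genetic distance or bp position is not numeric")
            )
    return problems


# First two lines of the file posted in the question.
posted = [
    "CHROM POS HG00098 HG00100 HG00106 HG00112 HG00114 HG00116 "
    "HG00117 HG00118 HG00119 HG00120 HG00122",
    "1 10469 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0",
]
# What the same two variants would look like in MAP layout
# (genetic distance 0 is an assumption).
valid = [
    "1 rs117577454 0 10469",
    "1 rs58108140 0 10583",
]
print(check_map_lines(posted))
print(check_map_lines(valid))
```

Every line of the posted file is flagged, while the four-column version passes.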
Hi, I am in the same situation. Is there any way to get a matrix with the p-value of each SNP for each permutation, i.e. a permutation matrix? With PLINK we only get the statistics, and there is no way to get the permutation matrix.
Thank you
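One possible workaround, sketched in Python and not an official PLINK feature: convert the dumped statistics into p-values yourself and transpose them into one row per SNP. The sketch assumes that each row of plink.mperm.dump.all is a replicate number followed by one 1-df chi-square statistic per SNP, that row 0 holds the unpermuted data, and that the columns follow the SNP order of plink.assoc; please check these against your PLINK version. The function name `dump_to_pvalue_rows` and the statistics in the example are made up for illustration.

```python
import math


def dump_to_pvalue_rows(dump_lines, snp_ids):
    """Convert dump rows (replicate, one statistic per SNP) to per-SNP p-value rows.

    Each statistic is treated as a chi-square value with 1 degree of freedom,
    whose upper-tail probability is erfc(sqrt(x / 2)). "NA" is passed through.
    """
    columns = [[] for _ in snp_ids]
    for line in dump_lines:
        fields = line.split()
        if not fields:
            continue
        replicate, stats = fields[0], fields[1:]
        if replicate == "0":
            continue  # assumed: row 0 is the observed (unpermuted) data
        if len(stats) != len(snp_ids):
            raise ValueError(f"replicate {replicate}: expected {len(snp_ids)} values")
        for column, value in zip(columns, stats):
            if value == "NA":
                column.append("NA")
            else:
                p_value = math.erfc(math.sqrt(float(value) / 2.0))
                column.append(f"{p_value:.4g}")
    return [" ".join([snp] + column) for snp, column in zip(snp_ids, columns)]


# SNP IDs from the plink.assoc excerpt; the statistics are invented.
snps = ["rs117577454", "rs58108140"]
dump = [
    "0 3.2 0.1",
    "1 0 NA",
    "2 1.5 2.7",
]
for row in dump_to_pvalue_rows(dump, snps):
    print(row)
```

Each output row starts with the SNP ID and is followed by one p-value per permutation, which is the layout asked for in the question. If the dump contains only NA, as in the question, the input files need to be fixed first.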