Question

Permutation using Plink

1

9.1 years ago

Molly_K ▴ 60

I'd like to generate permutation data from the genotypes of 200 samples using PLINK.

To create this file, I permuted the case/control labels in a GWAS experiment and recalculated the p-values of all SNPs under each permutation. I ran 50 permutations.

./plink --file 1000G_EUR --assoc --perm --mperm 50 --mperm-save-all --seed 6377474

The input consisted of a MAP file and a PED file. The first 10 rows and first 10 samples of the MAP file look like this:

CHROM   POS HG00098 HG00100 HG00106 HG00112 HG00114 HG00116 HG00117 HG00118 HG00119 HG00120 HG00122
1   10469   0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0
1   10583   0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|1 0|0
1   11508   1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1
1   11565   0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0
1   12783   1|1 1|1 1|1 1|1 1|0 1|1 1|1 1|1 1|1 1|1 1|1
1   13116   0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0
1   13327   0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0
1   13980   ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./.
1   14699   0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0

The first 10 rows (the first 10 samples) of the PED file look like this. (The first samples are all controls, so the 6th column is 1 for each of them. I have ~200 samples: samples 1-100 are coded 1 and samples 101-200 are coded 2.)

HG00098 HG00098 0   0   0   1   C   C   G   G
HG00100 HG00100 0   0   0   1   C   C   G   G
HG00106 HG00106 0   0   0   1   C   C   G   G
HG00112 HG00112 0   0   0   1   C   C   G   G
HG00114 HG00114 0   0   0   1   C   C   G   G
HG00116 HG00116 0   0   0   1   C   C   G   G
HG00117 HG00117 0   0   0   1   C   C   G   G
HG00118 HG00118 0   0   0   1   C   C   G   G
HG00119 HG00119 0   0   0   1   C   C   G   G
HG00120 HG00120 0   0   0   1   C   C   G   A

The ideal permutation data I'd like to have should look like the following: each row starts with a SNP (rs#), followed by the p-values of this SNP under multiple permutations.

rs422111        0.2025  0.1719  0.8134  0.7883  0.3906  0.849   0.3181  0.1951  0.911   0.6676
rs406028        0.2722  0.1246  0.6835  0.7852  0.4953  0.7191  0.3153  0.1993  0.7748  0.6562
rs391674        0.2355  0.1468  0.7475  0.722   0.4412  0.7832  0.2772  0.2286  0.8423  0.601
rs867389        0.8557  0.003566        0.4266  0.4393  0.3151  0.6239  0.1324  0.8949  0.6167  0.9822

At this point, unfortunately, my results all look like trash..!#%@#%^#@$%! Here are the first few rows of each output file as an example:

(A) plink.assoc file

CHR                     SNP         BP   A1      F_A      F_U   A2        CHISQ            P           OR
       1             rs117577454      10469    G       NA       NA    C           NA           NA           NA
       1              rs58108140      10583    A       NA       NA    G           NA           NA           NA

(B) plink.assoc.mperm

CHR                     SNP         EMP1         EMP2
       1             rs117577454            1            1
       1              rs58108140            1            1

(C) plink.mperm.dump.all

0   NA  NA  NA  NA  NA  NA  NA  NA  NA 
1   NA  NA  NA  NA  NA  NA  NA  NA  NA 
2   NA  NA  NA  NA  NA  NA  NA  NA  NA 
3   NA  NA  NA  NA  NA  NA  NA  NA  NA 
4   NA  NA  NA  NA  NA  NA  NA  NA  NA 
5   NA  NA  NA  NA  NA  NA  NA  NA  NA 
6   NA  NA  NA  NA  NA  NA  NA  NA  NA 
7   NA  NA  NA  NA  NA  NA  NA  NA  NA 
8   NA  NA  NA  NA  NA  NA  NA  NA  NA 
9   NA  NA  NA  NA  NA  NA  NA  NA  NA

plink SNP permutation • 5.2k views

ADD COMMENT • link updated 2.4 years ago by Ram 44k • written 9.1 years ago by Molly_K ▴ 60

1

That is an invalid MAP file.

ADD REPLY • link 5.1 years ago by chrchang523 11k
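To illustrate the point about the MAP file: a PLINK MAP file has one line per variant with four columns (chromosome, variant ID, genetic distance, base-pair position) and no header or per-sample genotype columns. The table shown in the question looks more like a VCF-style genotype matrix. The sketch below is only an illustration, not a fix verified against the original data: the helper name `map_line` is hypothetical, the genetic distance is assumed to be 0 (unknown), and the two rs IDs and positions are taken from the plink.assoc output in the question.

```python
def map_line(chrom, variant_id, pos, cm=0):
    """Build one MAP line: chromosome, variant ID, genetic distance, bp position.

    cm=0 is an assumption meaning "genetic distance unknown".
    """
    return f"{chrom}\t{variant_id}\t{cm}\t{pos}"


# First two variants as they appear in the plink.assoc output of the question.
variants = [
    ("1", "rs117577454", 10469),
    ("1", "rs58108140", 10583),
]

map_lines = [map_line(chrom, snp, pos) for chrom, snp, pos in variants]
print("\n".join(map_lines))
```

The MAP file needs exactly one such line for every pair of allele columns in the PED file, in the same order.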

0

Hi, I am in the same situation. Is there any way to get a matrix of p-values per SNP for each permutation? With PLINK we can only get the statistics, and there is no way to get the permutation matrix.

Thank you

ADD REPLY • link 5.1 years ago by mel22 ▴ 100
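One possible way to get from plink.mperm.dump.all to the per-SNP table requested in the question is to post-process the dump outside PLINK. The sketch below assumes the layout seen in the question (one row per replicate, the first column being the replicate index with 0 for the unpermuted data, then one statistic per SNP in MAP order) and assumes the statistics are 1-df chi-square values, as produced by a basic `--assoc` test. The function names and the example values are hypothetical; the upper-tail p-value of a 1-df chi-square statistic x is computed as erfc(sqrt(x / 2)).

```python
import math


def chisq1_pvalue(stat):
    """Upper-tail p-value of a 1-df chi-square statistic; 'NA' is passed through."""
    if stat == "NA":
        return "NA"
    return math.erfc(math.sqrt(float(stat) / 2.0))


def dump_to_snp_rows(dump_lines, snp_ids):
    """Transpose dump rows (one per replicate) into one list of p-values per SNP."""
    rows = [line.split() for line in dump_lines if line.strip()]
    # Skip replicate 0 (the original, unpermuted data) and drop the index column.
    perms = [row[1:] for row in rows if row[0] != "0"]
    return {
        snp: [chisq1_pvalue(perm[j]) for perm in perms]
        for j, snp in enumerate(snp_ids)
    }


# Made-up dump with two SNPs and two permutations, for illustration only.
example_dump = [
    "0 3.84 NA",
    "1 0.00 NA",
    "2 1.00 NA",
]
result = dump_to_snp_rows(example_dump, ["rsA", "rsB"])
for snp, pvals in result.items():
    print(snp, *pvals, sep="\t")
```

SNP IDs are not stored in the dump itself, so they have to be supplied in the same order as the MAP file. SNPs whose statistic is NA in every replicate, as in the output shown in the question, stay NA after conversion.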

score 0 · Answer 1 · 2018-04-01

0

6.7 years ago

Shicheng Guo ★ 9.6k

I am sure it is caused by your test data, since the SNPs in it are not significant. Try a different test dataset, or add some significant signals to this one, and you should obtain what you expect.

ADD COMMENT • link 6.7 years ago by Shicheng Guo ★ 9.6k