PLINK dosage gives some "NA"s for dichotomous and continuous outcomes after analysis
2.8 years ago
Sebastian ▴ 10

Hi all,

When I perform association analyses on assay data, some SNPs have "NA"s in the results, including the p-values. (Separators were modified to make the table easier to read.)

SNP                       A1 A2    FRQ   INFO     OR     SE      P
rs7285039:49400384:C:T     C  T   0.6271 1.0025 0.9639 0.0967 0.7036
rs62223366:49400394:G:A    G  A   0.8264 1.0081 1.1519 0.1198 0.2380
rs142250457:49400426:C:T   C  T   0.9893 0.8827 0.3574 0.6689 0.1239
rs151251698:49400491:G:A   G  A   0.9980 0.5810     NA     NA     NA
rs71314900:49400521:G:A    G  A   0.9386 0.9300 0.9219 0.2042 0.6905
rs1535123:49400552:A:G     A  G   0.2955 0.9987 0.9141 0.1015 0.3759
rs182615600:49400905:G:T   G  T   0.9974 0.6603     NA     NA     NA

We can quickly see that the SNPs with NAs all have a very small MAF (FRQ of about 0.998, i.e. a MAF of roughly 0.002).
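To make that observation concrete, here is a minimal Python sketch that takes the rows shown above, converts FRQ to MAF, and lists the SNPs that were not tested. The helper names (`minor_allele_freq`, `parse_row`) are made up for illustration, and the sketch assumes the eight whitespace-separated columns shown in the table.

```python
# Rows copied from the results table above (whitespace-separated).
ROWS = """\
rs7285039:49400384:C:T     C  T   0.6271 1.0025 0.9639 0.0967 0.7036
rs62223366:49400394:G:A    G  A   0.8264 1.0081 1.1519 0.1198 0.2380
rs142250457:49400426:C:T   C  T   0.9893 0.8827 0.3574 0.6689 0.1239
rs151251698:49400491:G:A   G  A   0.9980 0.5810     NA     NA     NA
rs71314900:49400521:G:A    G  A   0.9386 0.9300 0.9219 0.2042 0.6905
rs1535123:49400552:A:G     A  G   0.2955 0.9987 0.9141 0.1015 0.3759
rs182615600:49400905:G:T   G  T   0.9974 0.6603     NA     NA     NA
""".splitlines()


def minor_allele_freq(frq):
    """Return the smaller of an allele frequency and its complement."""
    return min(frq, 1.0 - frq)


def parse_row(line):
    """Hypothetical helper: split one result row into the fields we need."""
    snp, _a1, _a2, frq, info, _or, _se, p = line.split()
    return {
        "snp": snp,
        "maf": minor_allele_freq(float(frq)),
        "info": float(info),
        "tested": p != "NA",
    }


records = [parse_row(row) for row in ROWS]
untested = [rec for rec in records if not rec["tested"]]
tested = [rec for rec in records if rec["tested"]]

for rec in records:
    status = "tested" if rec["tested"] else "NA"
    print(f"{rec['snp']:28s} MAF={rec['maf']:.4f} INFO={rec['info']:.4f} {status}")
```

In this small excerpt the two NA rows are also the ones with the lowest MAF and the lowest INFO score, which is worth checking on the full output.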

Here is the content of the .log file for this analysis (for one chromosome):


PLINK v1.90b6.22 64-bit (16 Apr 2021)
Options in effect:
  --covar /f12/cov.f12.txt
  --dosage /chr22_info0.3_maf001_noheader.impute2.dosage noheader format=3
  --fam /f12/f12.fam
  --out /f12/f12.chr22
  --pheno /f12/pheno.f12.txt

Hostname: cesp-oxygene3.vjf.inserm.fr
Working directory: /plink
Start time: Fri Feb 11 19:53:55 2022

Random number seed: 1644605635
257542 MB RAM detected; reserving 128771 MB for main workspace.
1674 people (1383 males, 291 females) loaded from .fam.
1674 phenotype values present after --pheno.
Using 1 thread (no multithreaded calculations invoked).
--covar: 12 covariates loaded.
1674 people pass filters and QC.
Among remaining phenotypes, 1366 are cases and 308 are controls.
--dosage: Reading from /f12/chr22_info0.3_maf001_noheader.impute2.dosage.
--dosage: Results saved to /f12/f12.chr22.assoc.dosage .

End time: Fri Feb 11 19:55:28 2022


I have tried the following options, without success:

--vif (from 1 to 10000)

--cell (from 1 to 5)

Do you think this problem is linked to the MAF of those SNPs? Do you know a way to force PLINK to analyse them?

Thank you!


It does seem to have something to do with MAF. If you check, do all of those SNPs have a MAF below or close to 0.01?


Yes, they do. I just edited the head of the results so that this is easy to check.

The quality control for the imputed data was MAF > 0.0001 (0.01%).

1) I would like to know why PLINK refuses to analyse SNPs with a low MAF.

2) How can I force PLINK to analyse them anyway?

Thanks.

2.8 years ago
Sam ★ 4.8k

The most likely scenario isn't PLINK refusing to analyse the SNP; rather, at such a low MAF there is too little information to perform the test. You only have 1674 samples, so a SNP with a MAF of about 0.002 (like the two NA rows) is expected to carry only about 7 copies of the minor allele in total, and roughly 1 among your 308 controls. That's just not enough for a statistical analysis. With your sample size, a more reasonable MAF threshold will likely be 0.05 (or 5%).
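As a rough back-of-the-envelope check, the expected number of minor-allele copies is 2 × N × MAF. The sketch below applies that to the sample sizes in the log (1674 people, 1366 cases, 308 controls). The function name is made up for illustration, and the calculation assumes the allele is equally frequent in cases and controls. With imputed dosages and a low INFO score, the effective count is likely even smaller.

```python
# Sample sizes taken from the PLINK log in the question.
N_TOTAL = 1674
N_CASES = 1366
N_CONTROLS = 308


def expected_minor_copies(maf, n_people):
    """Expected minor-allele copies among n_people diploid individuals."""
    return 2 * n_people * maf


# 0.0001 is the QC threshold mentioned in the thread, 0.002 is roughly the
# MAF of the NA rows, and 0.01 and 0.05 are the thresholds discussed above.
for maf in (0.0001, 0.002, 0.01, 0.05):
    total = expected_minor_copies(maf, N_TOTAL)
    cases = expected_minor_copies(maf, N_CASES)
    controls = expected_minor_copies(maf, N_CONTROLS)
    print(f"MAF={maf:<7} total={total:7.2f} cases={cases:7.2f} controls={controls:6.2f}")
```

At the QC threshold of 0.0001 the expected total is below one copy, so many variants passing that filter may effectively be monomorphic in this sample.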
