Hi all,
When I run association analyses on assay data, some SNPs have "NA" values in the results, including for the p-value (separators modified to improve readability of the table):
SNP A1 A2 FRQ INFO OR SE P
rs7285039:49400384:C:T C T 0.6271 1.0025 0.9639 0.0967 0.7036
rs62223366:49400394:G:A G A 0.8264 1.0081 1.1519 0.1198 0.2380
rs142250457:49400426:C:T C T 0.9893 0.8827 0.3574 0.6689 0.1239
rs151251698:49400491:G:A G A 0.9980 0.5810 NA NA NA
rs71314900:49400521:G:A G A 0.9386 0.9300 0.9219 0.2042 0.6905
rs1535123:49400552:A:G A G 0.2955 0.9987 0.9141 0.1015 0.3759
rs182615600:49400905:G:T G T 0.9974 0.6603 NA NA NA
We can quickly see that the SNPs with NAs have a very small MAF (their FRQ is close to 1).
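To make that observation concrete, here is a minimal Python sketch that derives the MAF from the FRQ column for the rows shown above and lists the SNPs with an NA p-value. It assumes FRQ is the frequency of allele A1, so that MAF = min(FRQ, 1 - FRQ); the `rows` list and the `maf` helper are illustrative names, not part of PLINK.

```python
# Rows copied from the results table above: (SNP, FRQ, P).
# None stands for an "NA" p-value.
rows = [
    ("rs7285039:49400384:C:T", 0.6271, 0.7036),
    ("rs62223366:49400394:G:A", 0.8264, 0.2380),
    ("rs142250457:49400426:C:T", 0.9893, 0.1239),
    ("rs151251698:49400491:G:A", 0.9980, None),
    ("rs71314900:49400521:G:A", 0.9386, 0.6905),
    ("rs1535123:49400552:A:G", 0.2955, 0.3759),
    ("rs182615600:49400905:G:T", 0.9974, None),
]


def maf(frq):
    """Minor allele frequency, assuming frq is the frequency of allele A1."""
    return min(frq, 1.0 - frq)


# MAF per SNP, rounded to the 4 decimals used in the table.
mafs = {snp: round(maf(frq), 4) for snp, frq, _ in rows}

# SNPs for which the association test returned NA.
na_snps = [snp for snp, _, p in rows if p is None]

for snp, frq, p in rows:
    status = "NA" if p is None else f"P={p}"
    print(f"{snp}\tMAF={mafs[snp]:.4f}\t{status}")
```

In this excerpt, the two NA rows are the only ones with a MAF below 0.01.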
Here is the content of the .log file for this analysis (for one chromosome):
PLINK v1.90b6.22 64-bit (16 Apr 2021)
Options in effect:
--covar /f12/cov.f12.txt
--dosage /chr22_info0.3_maf001_noheader.impute2.dosage noheader format=3
--fam /f12/f12.fam
--out /f12/f12.chr22
--pheno /f12/pheno.f12.txt
Hostname: cesp-oxygene3.vjf.inserm.fr
Working directory: /plink
Start time: Fri Feb 11 19:53:55 2022
Random number seed: 1644605635
257542 MB RAM detected; reserving 128771 MB for main workspace.
1674 people (1383 males, 291 females) loaded from .fam.
1674 phenotype values present after --pheno.
Using 1 thread (no multithreaded calculations invoked).
--covar: 12 covariates loaded.
1674 people pass filters and QC.
Among remaining phenotypes, 1366 are cases and 308 are controls.
--dosage: Reading from /f12/chr22_info0.3_maf001_noheader.impute2.dosage.
--dosage: Results saved to /f12/f12.chr22.assoc.dosage .
End time: Fri Feb 11 19:55:28 2022
I have tried the following options, without success:
--vif (from 1 to 10000)
--cell (from 1 to 5)
Do you think this problem is linked to the MAF of those SNPs? Do you know a way to force PLINK to analyse them?
Thank you!
It does seem to have something to do with MAF. If you check, do all of those SNPs have a MAF below 0.01, or close to 0.01?
Yes, they do. I just edited the head of the results above so that this is easy to check.
The quality control for the imputed data was MAF > 0.0001 (0.01%).
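For a sense of scale, this rough Python sketch computes how many minor-allele copies would be expected at a few MAF values, using the sample sizes reported in the log (1366 cases, 308 controls). It only applies the expectation 2 x N x MAF; the function name is illustrative, and the sketch is not a statement about how PLINK decides to report NA.

```python
# Sample sizes taken from the PLINK log above.
N_CASES = 1366
N_CONTROLS = 308


def expected_minor_copies(n_people, maf):
    """Expected number of minor-allele copies among n_people diploid samples."""
    return 2 * n_people * maf


# MAF values: the QC threshold, the two NA SNPs in the table, and 0.01.
for maf in (0.0001, 0.002, 0.0026, 0.01):
    cases = expected_minor_copies(N_CASES, maf)
    controls = expected_minor_copies(N_CONTROLS, maf)
    print(f"MAF={maf}: ~{cases:.2f} copies in cases, ~{controls:.2f} in controls")
```

At a MAF of 0.002, for example, only about 1.2 minor-allele copies are expected among the 308 controls.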
1) I would like to know why PLINK refuses to analyse SNPs with a low MAF.
2) How can I force PLINK not to do this?
Thanks.