I am running a GWAS on the ADNI dataset (WGS CASAVA Illumina dataset) and using PLINK for quality control (QC). However, known Alzheimer's disease risk SNPs are being removed during the QC steps, and I am unsure why this is happening.
Steps followed:
I have an Alzheimer's disease dataset in VCF format. In each VCF file, the last two columns are MAXGT and POLY, which hold the genotype information. I used only the MAXGT column for the GWAS. I indexed and merged the VCF files using bcftools, then converted the merged VCF file into PLINK format.
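To make the "MAXGT only" step concrete, here is a minimal Python sketch of what I mean by keeping a single genotype column. It is not the exact command I ran. The function name `keep_sample_column` is hypothetical, and it assumes a tab-separated VCF with the nine standard fixed columns followed by the genotype columns.

```python
FIXED_COLUMNS = 9  # CHROM POS ID REF ALT QUAL FILTER INFO FORMAT


def keep_sample_column(vcf_lines, sample="MAXGT"):
    """Yield VCF lines reduced to the fixed columns plus one genotype column.

    Hypothetical helper for illustration; assumes tab-separated VCF text.
    """
    keep = None
    for line in vcf_lines:
        line = line.rstrip("\n")
        if line.startswith("##"):
            # Meta-information lines are passed through unchanged.
            yield line
            continue
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # Locate the requested column among the genotype columns only.
            keep = fields.index(sample, FIXED_COLUMNS)
        elif keep is None:
            raise ValueError("data line seen before the #CHROM header")
        yield "\t".join(fields[:FIXED_COLUMNS] + [fields[keep]])


# Toy input, not real ADNI data.
toy_vcf = [
    "##fileformat=VCFv4.1\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tMAXGT\tPOLY\n",
    "19\t100\trs1\tT\tC\t50\tPASS\t.\tGT\t0/1\t1/1\n",
]
for out_line in keep_sample_column(toy_vcf):
    print(out_line)
```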
QC steps applied:
--geno 0.2 and 0.02: This filter removed highly significant SNPs associated with Alzheimer's disease.
--check-sex: There is a discrepancy between PEDSEX and SNPSEX, causing an issue.
--mind 0.2 and 0.02
--maf 0.05
--hwe
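For reference, --geno removes a variant when its missing call rate exceeds the given threshold. The Python sketch below shows that comparison for a single SNP at the two thresholds listed above. The function names and the toy genotypes are hypothetical, and treating any GT that contains "." (including half-calls) as missing is a simplifying assumption, not necessarily how PLINK handles every case.

```python
def missing_call_rate(genotypes):
    """Return the fraction of samples with a missing genotype call.

    Assumption: a call is missing if its GT field contains '.',
    e.g. './.', '.|.' or '.'. Half-calls are counted as missing too.
    """
    if not genotypes:
        raise ValueError("no genotypes given")
    # The GT value is the first ':'-separated subfield of each sample entry.
    missing = sum(1 for gt in genotypes if "." in gt.split(":")[0])
    return missing / len(genotypes)


def removed_by_geno(genotypes, threshold):
    """Mimic the --geno rule: drop the variant if missingness exceeds threshold."""
    return missing_call_rate(genotypes) > threshold


# Toy example: 10 samples, one of them uncalled at this SNP.
toy_snp = ["0/1", "0/0", "./.", "1/1", "0/0", "0/1", "0/0", "0/0", "0/1", "0/0"]

print(missing_call_rate(toy_snp))      # 1 missing call out of 10 samples
print(removed_by_geno(toy_snp, 0.2))   # threshold 0.2
print(removed_by_geno(toy_snp, 0.02))  # threshold 0.02
```

Running the same comparison on the risk SNPs in the merged file would show whether their missingness alone explains the removal.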