How to deal with warnings from PLINK while performing QC using HapMap data?
4 months ago

I was performing quality control on the HapMap Phase 2 data from the official PLINK website, and I got several warnings when I ran the following command:

!plink --bfile C:/Users/User/GWAS/HapMap/Data/CEU/Data_prep/Data_prep_1 --geno 0.05 --maf 0.05 --hwe 1e-6 --mind 0.02 --make-bed --out C:/Users/User/GWAS/HapMap/Data/CEU/QC_1/QC_11

and the output was

PLINK v1.90b7.3 64-bit (4 Aug 2024)            www.cog-genomics.org/plink/1.9/
(C) 2005-2024 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to C:/Users/User/GWAS/HapMap/Data/CEU/QC_1/QC_11.log.
Options in effect:
  --bfile C:/Users/User/GWAS/HapMap/Data/CEU/Data_prep/Data_prep_1
  --geno 0.05
  --hwe 1e-6
  --maf 0.05
  --make-bed
  --mind 0.02
  --out C:/Users/User/GWAS/HapMap/Data/CEU/QC_1/QC_11

65315 MB RAM detected; reserving 32657 MB for main workspace.
3967651 variants loaded from .bim file.
90 people (44 males, 46 females) loaded from .fam.
15 people removed due to missing genotype data (--mind).
IDs written to C:/Users/User/GWAS/HapMap/Data/CEU/QC_1/QC_11.irem .
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 49 founders and 26 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate in remaining samples is 0.984496.
427810 variants removed due to missing genotype data (--geno).
--hwe: 0 variants removed due to Hardy-Weinberg exact test.
1507581 variants removed due to minor allele threshold(s)
(--maf/--max-maf/--mac/--max-mac).
2032260 variants and 75 people pass filters and QC.
Note: No phenotypes present.
--make-bed to C:/Users/User/GWAS/HapMap/Data/CEU/QC_1/QC_11.bed +
C:/Users/User/GWAS/HapMap/Data/CEU/QC_1/QC_11.bim +
C:/Users/User/GWAS/HapMap/Data/CEU/QC_1/QC_11.fam ... done.

Warning: 199 het. haploid genotypes present (see
C:/Users/User/GWAS/HapMap/Data/CEU/QC_1/QC_11.hh ); many commands treat these
as missing.
Warning: Nonmissing nonmale Y chromosome genotype(s) present; many commands
treat these as missing.
Warning: --hwe observation counts vary by more than 10%, due to the X
chromosome.  You may want to use a less stringent --hwe p-value threshold for X
chromosome variants.

Can anyone help me fix these warnings? An explanation of why they arise would also be appreciated.

Thank you in advance. I'm new to bioinformatics, so any help is really appreciated.

QC Hapmap PLINK GWAS • 503 views
4 months ago

Hi,

The first warning (het. haploid genotypes) is explained in the PLINK documentation: https://www.cog-genomics.org/plink/1.9/errors

"<number> het. haploid genotypes present (see plink.hh )." This is usually caused by male heterozygous calls in the X chromosome pseudo-autosomal region. Check the variants named in the .hh file; if they are all near the beginning or end of the X chromosome, --split-x should solve the problem. It can also be caused by incorrect sex information and/or an incorrect chromosome set. We strongly recommend addressing this warning as soon as you notice it.
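To check where the variants named in the .hh file sit, something like the following Python sketch may help. It assumes the .hh file has three whitespace-separated columns (family ID, within-family ID, variant ID) and the .bim file has the usual six columns (chromosome, variant ID, cM position, bp position, allele 1, allele 2). The function name `summarize_hh` is made up for illustration and is not part of PLINK.

```python
from collections import defaultdict


def summarize_hh(hh_lines, bim_lines):
    """Summarize variants flagged in a .hh file by chromosome (illustrative helper)."""
    # Assumed .hh columns: family ID, within-family ID, variant ID
    flagged = set()
    for line in hh_lines:
        fields = line.split()
        if len(fields) >= 3:
            flagged.add(fields[2])

    # Assumed .bim columns: chromosome, variant ID, cM, bp position, allele 1, allele 2
    by_chrom = defaultdict(list)
    for line in bim_lines:
        fields = line.split()
        if len(fields) >= 4 and fields[1] in flagged:
            by_chrom[fields[0]].append(int(fields[3]))

    # Report how many flagged variants each chromosome has and the range they span
    return {
        chrom: {"n_variants": len(bps), "min_bp": min(bps), "max_bp": max(bps)}
        for chrom, bps in by_chrom.items()
    }
```

Because open files iterate line by line, it can be called as `summarize_hh(open("QC_11.hh"), open("Data_prep_1.bim"))`, using the input .bim so that variants removed by the filters can still be looked up. Entries on the X chromosome (code 23 or X) that cluster near the start or end of the chromosome point to the pseudo-autosomal region, which is the case --split-x is meant for. Entries on the Y chromosome (code 24 or Y) correspond to the second warning about nonmale Y genotypes.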

Kevin
