Hi everyone! I plan to run a GWAS using Regenie (https://rgcgithub.github.io/regenie/). Regenie is LMM-based GWAS software that can handle the X chromosome. I will be generating population covariates in my sample to include as covariates in my GWAS analyses.
My question is: if I am to include the X-chromosome in my GWAS analyses, should I also include SNPs from the X-chromosome in my pruned list of SNPs used to create the population covariates?
I assume by "Population Covariates" you really mean principal components from the genetic relatedness matrix (GRM)? Inclusion/exclusion of the X chromosome should have a mild impact on the GRM (it is a large chromosome). The one issue is that, unless the software is specifically written for the X chromosome, the standard normalization (1/sqrt(2p*(1-p)), where p is the allele frequency) isn't entirely appropriate, because males carry only one copy of the X (they are effectively haploid there). If you can ensure proper normalization, then you should include the X chromosome; otherwise compute the GRM on the autosomes only.
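To make the normalization point concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not what Regenie, PLINK 2, or any other tool actually does. The function names are hypothetical. It assumes no missing genotypes, no monomorphic SNPs, non-pseudoautosomal X variants, and males coded 0/1 on the X. Treating the male variance as p(1-p) is one common choice; other codings (for example, males as 0/2) imply a different scaling.

```python
import numpy as np

def standardize_autosomal(G):
    """Standardize diploid allele counts (0/1/2); G is (n_samples, n_snps)."""
    p = G.mean(axis=0) / 2.0                       # allele frequency per SNP
    return (G - 2.0 * p) / np.sqrt(2.0 * p * (1.0 - p))

def standardize_x(G, is_male):
    """Standardize X-chromosome allele counts.

    Assumption: males are coded 0/1 (one copy), females 0/1/2 (two copies).
    """
    is_male = np.asarray(is_male, dtype=bool)
    copies = np.where(is_male, 1.0, 2.0)           # X copies carried by each sample
    p = G.sum(axis=0) / copies.sum()               # frequency over all X copies
    copies = copies[:, None]
    # Expected count is copies*p and variance is copies*p*(1-p) for each sample.
    return (G - copies * p) / np.sqrt(copies * p * (1.0 - p))

def grm(Z):
    """Relatedness matrix from standardized genotypes, averaged over SNPs."""
    return Z @ Z.T / Z.shape[1]
```

With this coding, a male carrying the alternate allele is centered on p rather than 2p, which is the step that the standard autosomal formula gets wrong when it is applied to the X unchanged.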
Thanks, both, for your help! I was planning to use Plink2 to calculate the principal components from the genetic relatedness matrix/population covariates. Would you know if Plink2 appropriately handles the inclusion of the X chromosome when calculating the principal components? I can't see any information about the X chromosome on the PCA page (https://www.cog-genomics.org/plink/2.0/strat#pca).
Agree. For X (and Y, if you are considering Y), you need to account for allele dosage. LChart is right on. +1
This comment in the Plink2 source: https://github.com/chrchang/plink-ng/blob/master/2.0/plink2_matrix_calc.cc#L3509
suggests that X-chromosome handling is currently not implemented there.