Hi all,
We are doing a case-control GWAS. The cases were genotyped on the Illumina HumanCoreExome array, while the controls were genotyped on a different platform (both groups are European). We have taken several steps to harmonise the two groups across platforms, including:
- QC the case group: remove samples with high missingness and samples with discordant sex, remove low-quality variants, etc.
- Do the same for the control group
- Use bcftools fixref (https://samtools.github.io/bcftools/howtos/plugin.fixref.html) to check for reference mismatches and swap the alleles where needed.
- Use bcftools merge, keeping only records that are present in both cases and controls
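To make the last step concrete, here is a minimal Python sketch of what "present in both" means, assuming a record is identified by chromosome, position, REF and ALT. The `Variant` type, the `shared_variants` helper and the toy records are hypothetical and only for illustration; this is not how bcftools itself is implemented.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    """Hypothetical record key: one biallelic site."""
    chrom: str
    pos: int
    ref: str
    alt: str


def shared_variants(cases, controls):
    """Return the variants found in both inputs, in tuple sort order."""
    return sorted(set(cases) & set(controls))


# Toy data (made up): the second site has REF/ALT swapped between the groups.
cases = [
    Variant("1", 1000, "A", "G"),
    Variant("1", 2000, "C", "T"),
    Variant("2", 500, "G", "A"),
]
controls = [
    Variant("1", 1000, "A", "G"),
    Variant("1", 2000, "T", "C"),
    Variant("2", 500, "G", "A"),
]

shared = shared_variants(cases, controls)
for v in shared:
    print(v.chrom, v.pos, v.ref, v.alt)
```

With this exact-match key, the site at 1:2000 drops out because its REF and ALT are swapped between the two groups, which is why the allele check has to come before the merge.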
And this was the GWAS result:
I noticed something strange in the case group: the MAFs of the SNPs showing spurious associations all differ greatly from those in gnomAD Europeans, and of course from those in the control data. Meanwhile, the MAFs in the control data and in gnomAD are perfectly correlated.
So, is the weird behaviour in the cases due to strand swapping? But how could that be, given that I already checked for reference mismatches?
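The comparison above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions. It uses the frequency of the ALT allele rather than the MAF, since an allele swap turns a frequency p into 1 - p while leaving the minor allele frequency unchanged. The field names, the `flag_discordant` helper, the thresholds and the toy numbers are all hypothetical, not values from our data.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_strand_ambiguous(ref, alt):
    """True for A/T and C/G SNPs, whose alleles look the same on both strands."""
    return COMPLEMENT.get(ref) == alt


def flag_discordant(records, max_diff=0.2, swap_tol=0.05):
    """Flag SNPs whose case ALT frequency is far from the reference panel.

    max_diff and swap_tol are arbitrary illustrative thresholds.
    """
    flagged = []
    for rec in records:
        diff = abs(rec["case_af"] - rec["ref_af"])
        if diff <= max_diff:
            continue
        # A case frequency close to 1 - ref_af is what an allele swap looks like.
        looks_swapped = abs(rec["case_af"] - (1.0 - rec["ref_af"])) <= swap_tol
        flagged.append({
            "id": rec["id"],
            "diff": round(diff, 3),
            "looks_swapped": looks_swapped,
            "strand_ambiguous": is_strand_ambiguous(rec["ref"], rec["alt"]),
        })
    return flagged


# Toy records (made-up numbers): ALT allele frequency in cases vs. a reference panel.
records = [
    {"id": "rs_a", "ref": "A", "alt": "G", "case_af": 0.30, "ref_af": 0.31},
    {"id": "rs_b", "ref": "A", "alt": "T", "case_af": 0.80, "ref_af": 0.21},
    {"id": "rs_c", "ref": "C", "alt": "T", "case_af": 0.55, "ref_af": 0.20},
]

flagged = flag_discordant(records)
for row in flagged:
    print(row)
```

In this toy example `rs_b` is flagged as both swap-like and strand-ambiguous, while `rs_c` is discordant without matching the swap pattern, so the two situations can be told apart.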