Hi guys, I'm wondering whether it is possible to do a GWAS/SNP association analysis on pooled sequencing data without a control. To my understanding, a GWAS/SNP association analysis compares SNPs between an experimental group and a control group at every position, so without a proper control, wouldn't every SNP in the experimental group come out as significant?
For example, the data I have:
- Experiment: three sequencing files, each containing a different number of pooled, randomly EMS-mutagenized, drug-resistant flies.
- A standard Drosophila reference .fasta file.

The data I don't have:
- Control: pooled, randomly EMS-mutagenized, drug-non-resistant flies, or simply a randomly EMS-mutagenized group without drug selection.
I've been calling variants on my pooled sequencing files with CRISP, which gives me an estimated allele frequency for each SNP, and the Hardy–Weinberg equilibrium (HWE) test value is 0.0 for all SNPs. Is there anything I can do to narrow down the causative genes (SNPs) or regions? Or is all I can do to focus on the SNPs with high allele frequency and hope for the best?
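To make that last option concrete, here is a minimal sketch of what I mean by focusing on high-frequency SNPs. It assumes I have already pulled the per-pool allele frequencies out of the CRISP output into a plain dictionary; the function name `rank_shared_high_af`, the `min_af` cutoff of 0.5, the requirement that a SNP pass in all three pools, and the example values are all hypothetical, not real data or part of CRISP.

```python
from typing import Dict, List, Tuple

# A SNP is identified by (chromosome, position, ref allele, alt allele).
Snp = Tuple[str, int, str, str]


def rank_shared_high_af(
    freqs: Dict[Snp, List[float]], min_af: float = 0.5
) -> List[Tuple[Snp, float]]:
    """Keep SNPs whose allele frequency is >= min_af in every pool.

    Results are ranked by the lowest frequency seen across pools,
    highest first, so SNPs that are consistently common come out on top.
    """
    kept = []
    for snp, pool_afs in freqs.items():
        lowest = min(pool_afs)
        if lowest >= min_af:
            kept.append((snp, lowest))
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept


# Made-up allele frequencies for three pools (illustration only).
example_freqs: Dict[Snp, List[float]] = {
    ("2L", 1000, "G", "A"): [0.9, 0.8, 0.85],
    ("2L", 2000, "C", "T"): [0.95, 0.1, 0.2],
    ("3R", 500, "G", "A"): [0.6, 0.7, 0.65],
}

ranked = rank_shared_high_af(example_freqs, min_af=0.5)
for snp, lowest_af in ranked:
    print(snp, lowest_af)
```

Ranking by the lowest frequency across pools is only one possible choice; it drops SNPs that are common in a single pool but rare in the others, which may or may not be what I want.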
I'd appreciate any insight you can provide!