Hello, this is my first post in this renowned group. Actually, I am struggling with haplotype-based GWAS. I am doing GWAS on a plant pathogen. The single-SNP GWAS has not yielded any significant result, so I have decided to use haplotype-based GWAS. I have a VCF file containing 717045 SNPs. I used PLINK to generate haplotype blocks, which yielded more than 74000 blocks. But I do not know how to use that data in the GWAS. The PLINK output is in .blocks and .blocks.det format.
Can someone please give me some ideas about how to use that data for GWAS? Or how can I perform a haplotype-based GWAS, and with which tool?
Looking forward to your reply. Thank you very much.
Regards, Anik Dutta
Thank you very much, Kevin. But the problem is that I have quite a few SNPs that are multiallelic, and I do not want to lose them. If I import them into PLINK they are lost, because PLINK only accepts biallelic SNPs. Do you have any suggestions on how I can keep the multiallelic SNPs?
Anik
What if you split the multi-allelic records into individual bi-allelic records? This can be done with:
bcftools norm -m-any
Can this be done with VCFtools or bcftools? Is bcftools included in PLINK? Sorry if the question is naive; I am new to this field.
No, you would have to split these multi-allelic calls outside of PLINK, and then input the VCF file(s) back into PLINK.
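As a rough sketch of that two-step workflow, something like the following might work, assuming bcftools and PLINK 1.9 are both installed and on your PATH. The function name `split_and_import` and the file names are placeholders of mine, not anything from your data, and `--allow-extra-chr` is only there on the assumption that your pathogen's contig names are not ones PLINK recognises.

```shell
#!/usr/bin/env bash
# Sketch only: split multi-allelic VCF records with bcftools, then import
# the result into PLINK. The function name and file names are hypothetical.

split_and_import() {
    local in_vcf="${1:-}"      # input VCF, e.g. input.vcf.gz (placeholder)
    local out_prefix="${2:-}"  # prefix for all output files (placeholder)

    if [ -z "$in_vcf" ] || [ -z "$out_prefix" ]; then
        echo "usage: split_and_import <in.vcf.gz> <out_prefix>" >&2
        return 1
    fi

    # Step 1 (outside PLINK): split each multi-allelic record into
    # separate bi-allelic records and write a bgzip-compressed VCF.
    bcftools norm -m-any -Oz -o "${out_prefix}.vcf.gz" "$in_vcf" || return 1

    # Step 2: read the split VCF back into PLINK and write .bed/.bim/.fam.
    # --allow-extra-chr is assumed to be needed for non-human contig names.
    plink --vcf "${out_prefix}.vcf.gz" --allow-extra-chr \
          --make-bed --out "$out_prefix"
}
```

You would then call it as, for example, `split_and_import input.vcf.gz split`. After splitting, several records can share the same position, so it is worth checking the variant IDs in the resulting .bim file before running anything downstream.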
Ok thanks a lot for the information.