Hello,
I have 15 VCF files for one population and 15 VCF files for another. I want to check the differences and the similarities between the two groups: what changes from one group to the other, what remains the same, and a significance score if possible.
I have read about PLINK, but I am not sure what the pipeline should look like. Which steps should I follow? I read the documentation and it is not clear to me.
I also read about bcftools isec, which is useful for intersecting multiple VCF files. Could I combine the 15 VCF files of the first group with each other, do the same for the other 15, and end up with two files, population1_variants.vcf and population2_variants.vcf, and then compare those two against each other to check for the differences and similarities?
Which approach is better? Is this the way people usually analyze variants between populations? How can I assess the significance of the results? Are there any other approaches?
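To make the significance part of the question concrete, here is a minimal sketch of the kind of per-variant test I have in mind: count alternate and reference alleles in each group at one site and run a two-sided Fisher's exact test on the resulting 2x2 table. The function names and the example genotypes are hypothetical, and the test is written with the standard library only. It assumes the genotypes of all samples are available per site (so sites where a sample is homozygous reference are not lost), treats every allele as independent, and would still need multiple-testing correction across sites.

```python
from math import comb


def count_alleles(genotypes):
    """Return (alt_count, ref_count) from GT strings such as '0/1', '1|1', './.'."""
    alt = ref = 0
    for gt in genotypes:
        for allele in gt.replace("|", "/").split("/"):
            if allele == "0":
                ref += 1
            elif allele.isdigit():
                alt += 1  # any non-reference allele counts as alternate
            # missing calls ('.') are skipped
    return alt, ref


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= observed * (1 + 1e-9))
    return min(1.0, p)


# Hypothetical genotypes for one site in each group of 15 samples.
pop1 = ["1/1"] * 10 + ["0/1"] * 5
pop2 = ["0/0"] * 12 + ["0/1"] * 3

alt1, ref1 = count_alleles(pop1)
alt2, ref2 = count_alleles(pop2)
p_value = fisher_exact_two_sided(alt1, ref1, alt2, ref2)
print(f"pop1 alt/ref = {alt1}/{ref1}, pop2 alt/ref = {alt2}/{ref2}, p = {p_value:.3g}")
```

With only 15 samples per group, a test like this has little power for anything but large frequency differences, so I am unsure whether this is the right way to go.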
Thank you
Thank you! This seems like a nice approach, and it is what I was looking for. Would the mydata input be the merged 15samplescase.vcf and 15samplescontrol.vcf? And should those merged VCF files contain only the variants common to all 15 samples in each group?
Thank you