What are the cons of using hierarchical clustering to construct cohorts for GWAS? For example, suppose I do not trust my target (classification) labels, as with mental disorder classifications from the DSM/ICD, and believe they draw false boundaries between complex disease states. Can I not just disregard them and perform GWAS on cohorts of patients with mental disorders as defined by a "cut" at a given level of the hierarchical clustering? Has anyone already done this?
This expectation is certainly not shared by everyone doing GWAS. Alkes Price has done a lot of work showing that GWAS may have failed to show greater evidence of association because thousands of markers contribute, each associated too weakly to reach the 5 × 10⁻⁸ significance threshold commonly used in GWA studies.
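For concreteness, the cohort construction asked about in the question could look roughly like the sketch below. It is only an illustration under stated assumptions: the feature matrix is synthetic (not real patient data), the helper name `assign_cohorts`, the Ward linkage, and the choice of three cohorts are all arbitrary choices made for the example, and the association testing itself is not shown.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def assign_cohorts(features, n_cohorts):
    """Cut a Ward-linkage dendrogram into at most n_cohorts flat clusters.

    Hypothetical helper for illustration; returns one integer label per row.
    """
    tree = linkage(features, method="ward")
    return fcluster(tree, t=n_cohorts, criterion="maxclust")


# Synthetic stand-in for a patients x symptom-scores matrix (assumption, not real data).
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0, 0.0], [5.0, 5.0, 5.0], [10.0, 10.0, 10.0]])
features = np.vstack([c + rng.normal(scale=0.5, size=(20, 3)) for c in centers])

# The "cut": every patient gets a cohort label, ignoring any diagnostic labels.
cohorts = assign_cohorts(features, n_cohorts=3)

# Cohort sizes; each cohort would then be a candidate case group for its own analysis.
labels, sizes = np.unique(cohorts, return_counts=True)
print(dict(zip(labels.tolist(), sizes.tolist())))
```

Note that the number of cohorts, the linkage method, and the distance metric are all free parameters here, so the resulting case definitions depend on choices that the data alone do not settle.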