Hello,
Let's say I have three independent cohorts on which I want to perform a GWAS for a specific continuous phenotype. This phenotype is not normally distributed, so the idea is to first normalize it using a rank-based inverse normal transformation, and then perform linear regression on genotypes (with e.g. plink).
As I have three cohorts, the plan is to perform a meta-analysis combining the results of the three cohorts.
However, before the meta-analysis, and even before the individual GWAS on each cohort, I have a question regarding the phenotype normalization step.
Should I:
- normalize each cohort independently of the others,
- or merge the clinical tables of the three cohorts and normalize the phenotype using all values together?
The normalized phenotypes obtained with option 1 (invRank) and option 2 (invRankGlobal) will have different values, because cohort_3 has a slightly lower average value for the phenotype of interest.
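To make the two options concrete, here is a minimal Python sketch using only the standard library. The phenotype values are made-up toy numbers, the function name `inverse_normal_transform` is hypothetical, and the Blom offset c = 3/8 is an assumption (other offsets are also in use). It only illustrates how invRank and invRankGlobal differ; it is not meant as a recommendation for either one.

```python
from statistics import NormalDist, mean


def inverse_normal_transform(values, c=3.0 / 8):
    """Rank-based inverse normal transform (ties receive their average rank).

    The offset c = 3/8 (Blom) is an assumption, not something fixed by the question.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    nd = NormalDist()  # standard normal: mean 0, sd 1
    return [nd.inv_cdf((r - c) / (n - 2 * c + 1)) for r in ranks]


# Toy phenotype values (hypothetical); cohort_3 is lower on average.
cohorts = {
    "cohort_1": [5.1, 6.3, 7.8, 9.4, 12.0],
    "cohort_2": [5.4, 6.0, 8.1, 9.9, 11.5],
    "cohort_3": [3.2, 4.0, 4.9, 6.1, 7.0],
}

# Option 1 (invRank): transform each cohort on its own.
inv_rank = {name: inverse_normal_transform(vals) for name, vals in cohorts.items()}

# Option 2 (invRankGlobal): pool all values, transform once, split back by cohort.
pooled = [v for vals in cohorts.values() for v in vals]
pooled_int = inverse_normal_transform(pooled)
inv_rank_global = {}
start = 0
for name, vals in cohorts.items():
    inv_rank_global[name] = pooled_int[start:start + len(vals)]
    start += len(vals)

# Compare the per-cohort mean of the transformed phenotype under both options.
for name in cohorts:
    print(name, round(mean(inv_rank[name]), 3), round(mean(inv_rank_global[name]), 3))
```

With invRank, every cohort ends up centred on zero, so the between-cohort shift is removed. With invRankGlobal, cohort_3 keeps a negative mean because its values occupy the lower ranks of the pooled distribution.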
Which option should I choose?
Thanks