I want to perform a regression to check the correlation between two phenotypes. Both phenotypes are age-dependent, so I need to correct for this.
There is one problem, however: the two phenotypes were not recorded at the same time, so for any given individual I don't have both phenotypes measured at the same age. The time span between the two records also differs between individuals.
Basically, I have the following data:
- phenotype A
- age when phenotype A was recorded
- phenotype B
- age when phenotype B was recorded
I want to regress phenotype A on phenotype B while correcting for age, but I'm a bit unsure how to go about this.
My thought was to first regress phenotype B on 'age when phenotype B was recorded', then use those residuals as the predictor in a regression of phenotype A that also includes 'age when phenotype A was recorded' as a covariate. Would this be a correct way to go about it?
I appreciate any input!
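One way to get a feel for the options is a small simulation. The sketch below (Python with NumPy) uses entirely made-up data: the variable names, sample size, and effect sizes are assumptions for illustration, not anything from the real dataset. It compares three estimates of the slope of phenotype A on phenotype B: a single model with both ages as covariates, a version where both phenotypes are residualized on both ages first, and the two-step idea described above. By the Frisch-Waugh-Lovell theorem the first two give the same slope; the two-step version is not guaranteed to match them, because it adjusts each phenotype for a different age only.

```python
import numpy as np


def ols(y, *covariates):
    """Least-squares fit of y on an intercept plus the given covariates.

    Returns (coefficients, residuals); coefficients[0] is the intercept.
    """
    X = np.column_stack([np.ones(len(y)), *covariates])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, y - X @ coef


# Hypothetical simulated data: every number here is an assumption.
rng = np.random.default_rng(0)
n = 500
age_a = rng.uniform(20, 60, n)            # age when phenotype A was recorded
age_b = age_a + rng.uniform(-5, 5, n)     # recorded at a varying gap from A
shared = rng.normal(size=n)               # age-independent component shared by A and B
pheno_a = 2.0 + 0.5 * age_a + shared + rng.normal(scale=0.5, size=n)
pheno_b = 1.0 + 0.3 * age_b + shared + rng.normal(scale=0.5, size=n)

# Option 1: one model, A ~ B + age_a + age_b.
coef_full, _ = ols(pheno_a, pheno_b, age_a, age_b)
slope_full = coef_full[1]

# Option 2: residualize BOTH phenotypes on BOTH ages, then regress residuals.
_, res_a = ols(pheno_a, age_a, age_b)
_, res_b = ols(pheno_b, age_a, age_b)
coef_fwl, _ = ols(res_a, res_b)
slope_fwl = coef_fwl[1]

# Option 3: the two-step idea from the question:
# residualize B on its own age only, then fit A ~ residual_B + age_a.
_, res_b_only = ols(pheno_b, age_b)
coef_two_step, _ = ols(pheno_a, res_b_only, age_a)
slope_two_step = coef_two_step[1]

print("full model slope:      ", slope_full)
print("both residualized slope:", slope_fwl)
print("two-step slope:         ", slope_two_step)
```

If the two-step slope differs noticeably from the other two on data like yours, that would suggest fitting the single model with both ages as covariates instead, since it adjusts both phenotypes for both ages in one step.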
If you treat age as a categorical factor and it has more than 5 levels, you could perhaps try adding age as a random effect.