Hi all,
I have different types of data (binary, continuous and ordinal) for 10,000 individuals, and I would like to combine them into a single variable to run a GWAS. I was thinking of transforming everything into a number ranging from 0 to 1 by:
Binary: leaving it as it is
Continuous: scaling the values into the 0-1 range
Ordinal: assigning evenly spaced values; if there are, for example, 5 categories, the values assigned will be 0, 0.25, 0.5, 0.75, 1
And then, at the end, we can obtain a single value per individual by adding all the values.
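To make the idea concrete, here is a minimal Python sketch of the scheme just described. The function names and the toy values are hypothetical, made up only for illustration, and the ordinal categories are assumed to be coded as integer ranks 0 to k-1. It is a sketch of the proposal, not a validated phenotype definition.

```python
def scale_continuous(values):
    """Min-max scale a list of numbers into the 0-1 range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: no spread to scale, so return zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def scale_ordinal(ranks, n_categories):
    """Map integer ranks 0..n_categories-1 to evenly spaced values in 0-1."""
    return [r / (n_categories - 1) for r in ranks]


def combined_score(binary, continuous, ordinal_ranks, n_categories):
    """Add the binary, scaled continuous and scaled ordinal values per individual."""
    cont = scale_continuous(continuous)
    ordv = scale_ordinal(ordinal_ranks, n_categories)
    return [b + c + o for b, c, o in zip(binary, cont, ordv)]


# Toy data for three hypothetical individuals.
binary = [0, 1, 1]               # already 0/1, left as is
continuous = [2.0, 4.0, 6.0]     # scaled to 0, 0.5, 1
ordinal_ranks = [0, 2, 4]        # 5 categories, scaled to 0, 0.5, 1

scores = combined_score(binary, continuous, ordinal_ranks, n_categories=5)
print(scores)
```

Note that a plain sum like this gives every variable the same weight, which is itself an assumption.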
Do you think this is reasonable, or is there a better way to do it? If you have any references dealing with a similar problem, they would be of great help.
Many thanks
It's not clear that the resulting value would have any biological relevance. I suggest you think about the underlying biology or phenotype and come up with a scoring system that best matches it.
Can you give me an example?
An example wouldn't help you; you need to derive this yourself.
I don't understand what you mean by "come up with a scoring system that best matches a phenotype". Let's say you want to do a GWAS of intelligence and you have the results of different exams as quantitative data (1 to 10) or ordinal data (very bad, bad, regular, good, excellent). Are you saying that it makes no sense to combine these variables into one representing the performance of each student?
If I had meant what you wrote then I would have written that. Your question is incredibly general, so any reply to it must necessarily be general as well. Combining scores in some way obviously makes sense; it's just that there's no generic way that's appropriate. It will depend entirely on the exact nature of the scores and the phenotype being assessed. What others have done will then be irrelevant unless their scores and phenotype were quite similar, in which case one presumes you would have found it in the literature. Since you presumably didn't find anything similar, you're in uncharted territory.