Hi, I have the following problem:
I have expression data which I log2-transformed and normalized with RSN (with lumi). Besides using it for differential expression analysis etc., I exported the expression values of some interesting genes to correlate them with phenotypes of our cohort.
For transforming the cohort traits we usually use the natural logarithm (ln).
My question is whether I can correlate ln-transformed traits with log2-transformed expression data. Of course I can, but is it correct? I tried it with some random data and checked log2, log10 and ln. The effect direction stays the same, but the p-value and correlation coefficient differ. It would be no problem to transform the traits with log2 instead, but it would be interesting to know what you think about correlating two differently transformed variables. Is it mathematically correct?
Thanks in advance.
Best,
Tobi
Mathematical correctness doesn't enter into this; it is a statistical question. You have to think from a reader's perspective: would it unnecessarily confuse the reader to have your logarithm base be 2 on one axis and e on the other? Probably. If every other figure in your paper uses log2 for the expression data, I'd probably log2-transform your response traits as well. (Nonetheless, I don't think the choice of base should affect your correlation coefficient or p-value: changing the base only multiplies every value by a positive constant, since log2(x) = ln(x)/ln(2), and correlation is unchanged by that kind of rescaling.)
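To check the point about the logarithm base, here is a minimal sketch in Python (not the R/lumi workflow from the question). The data are simulated and the names `expression`, `trait` and `pearson` are made up for illustration; the only assumption is that both variables are strictly positive so the logarithms exist. It correlates log2-transformed "expression" with the same "trait" transformed by ln, log2 and log10.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

random.seed(0)

# Simulated, strictly positive stand-ins for expression values and a trait
# (hypothetical data, not taken from the thread).
expression = [random.lognormvariate(5, 1) for _ in range(200)]
trait = [e ** 0.5 * random.lognormvariate(0, 0.5) for e in expression]

expr_log2 = [math.log2(v) for v in expression]

# The same trait under three different logarithm bases.
trait_ln = [math.log(v) for v in trait]
trait_log2 = [math.log2(v) for v in trait]
trait_log10 = [math.log10(v) for v in trait]

r_ln = pearson(expr_log2, trait_ln)
r_log2 = pearson(expr_log2, trait_log2)
r_log10 = pearson(expr_log2, trait_log10)

# The three coefficients should agree up to floating-point rounding.
print(r_ln, r_log2, r_log10)
```

Because the usual t-test for a Pearson correlation depends only on the coefficient and the sample size, identical coefficients imply identical p-values. The same holds for rank-based correlations such as Spearman's, since a logarithm of any base greater than 1 preserves the ordering of the values. If the results differ between bases, something other than the base probably changed between runs, for example the random data being regenerated.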
I'm not sure I even understand the question. Can someone here rephrase?
Can you please be a bit more specific about what you did not understand? Or is it the whole question?