Hello
I would like to find biomarkers whose gene expression correlates with the geometric mean of cell size. I have not yet found an article that did something similar. I ran Spearman's correlation test on TMM-normalized expression, and one of the most strongly correlated genes was indeed a canonical regulator of this phenotype, so I assume other significantly correlated genes could also be biomarkers of it. Because my data come from humans and are heterogeneous, I intend to stratify the analysis by other phenotypes that I am not interested in, and keep the genes that are correlated in all, or at least most, of the stratified analyses. To reduce the number of genes tested, I restricted the analysis to the 5,000 most variable genes with expression of at least 25 TPM in 50% of the samples. Does that sound adequate?
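To make the question concrete, here is a minimal Python sketch of the workflow described above: filter by expression and variance, run Spearman's test per gene within each stratum, and keep genes that are significant with the same sign everywhere. The data are simulated, and all names (`tpm`, `cell_size`, `stratum`, `filter_genes`, `bh_adjust`, `spearman_by_stratum`) are hypothetical. Several choices are my own assumptions, not an established protocol: reading "in 50% of the samples" as "at least 50%", computing variance on log2(TPM + 1), applying Benjamini-Hochberg correction, and requiring FDR < 0.05 in every stratum.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

# Simulated toy data (hypothetical): 200 genes x 40 samples of TPM values.
n_genes, n_samples = 200, 40
tpm = rng.gamma(shape=2.0, scale=30.0, size=(n_genes, n_samples))
# Phenotype with a wide range, e.g. geometric mean cell size per sample.
cell_size = rng.permutation(np.geomspace(5.0, 80.0, n_samples))
# Gene 0 is made to track the phenotype so there is something to find.
tpm[0] = cell_size**2 / 5.0 * rng.lognormal(0.0, 0.1, n_samples)
# Nuisance phenotype used only for stratification.
stratum = np.repeat(["A", "B"], n_samples // 2)


def filter_genes(tpm, min_tpm=25.0, min_frac=0.5, top_n=5000):
    """Keep genes with >= min_tpm in >= min_frac of samples,
    then the top_n of those by variance of log2(TPM + 1)."""
    expressed = (tpm >= min_tpm).mean(axis=1) >= min_frac
    idx = np.flatnonzero(expressed)
    variances = np.log2(tpm[idx] + 1.0).var(axis=1)
    keep = np.argsort(variances)[::-1][:top_n]
    return np.sort(idx[keep])


def bh_adjust(p):
    """Benjamini-Hochberg adjusted p-values."""
    p = np.asarray(p, dtype=float)
    n = p.size
    order = np.argsort(p)
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity from the largest p-value downwards.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    q = np.empty(n)
    q[order] = np.clip(scaled, 0.0, 1.0)
    return q


def spearman_by_stratum(tpm, phenotype, stratum, genes):
    """Return {stratum level: (rho, q)} for the selected genes."""
    results = {}
    for level in np.unique(stratum):
        mask = stratum == level
        rho = np.empty(len(genes))
        p = np.empty(len(genes))
        for i, g in enumerate(genes):
            rho[i], p[i] = spearmanr(tpm[g, mask], phenotype[mask])
        results[level] = (rho, bh_adjust(p))
    return results


genes = filter_genes(tpm, top_n=100)  # top_n would be 5000 on real data
results = spearman_by_stratum(tpm, cell_size, stratum, genes)

# A gene is "consistent" if FDR < 0.05 and the sign of rho agrees in every stratum.
significant = np.all([q < 0.05 for _, q in results.values()], axis=0)
signs = np.array([np.sign(rho) for rho, _ in results.values()])
same_sign = np.all(signs == signs[0], axis=0)
consistent = genes[significant & same_sign]
print("Consistent genes:", consistent)
```

Relaxing "every stratum" to "most strata" would only change the `np.all` over strata into a count against a threshold. Note that each stratum has fewer samples than the full cohort, so power drops as strata are added.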
Also, is there any reason I should use linear regression or WGCNA instead? I am assuming that a univariate linear regression would yield results similar to Spearman's correlation. From what I have read about WGCNA, and from what a colleague showed me, continuous variables are turned into traits, which I believe could be a problem since the range of cell sizes is quite wide.
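On the assumption that univariate regression and Spearman's correlation behave similarly, a small simulated check may help. The sketch below uses hypothetical names (`cell_size`, `expr`) and a relationship I made up for illustration (expression linear in log cell size). It relies on two facts: the correlation coefficient from a simple linear regression is Pearson's r, and Spearman's rho is Pearson's r computed on ranks. So the two approaches coincide exactly only when the regression is fitted to ranks. On raw values they can diverge when the relationship is monotonic but not linear, which seems plausible with a wide cell size range.

```python
import numpy as np
from scipy.stats import linregress, pearsonr, rankdata, spearmanr

rng = np.random.default_rng(1)

# Simulated data (hypothetical): a wide, right-skewed cell size range,
# with expression that is linear in log2(cell size) plus noise.
cell_size = rng.lognormal(mean=3.0, sigma=0.8, size=60)
expr = 2.0 * np.log2(cell_size) + rng.normal(0.0, 0.5, size=60)

# Univariate linear regression on the raw phenotype.
fit_raw = linregress(cell_size, expr)
r_pearson, _ = pearsonr(cell_size, expr)

# Spearman's correlation, and the same regression fitted to ranks.
rho, _ = spearmanr(cell_size, expr)
fit_rank = linregress(rankdata(cell_size), rankdata(expr))

# Regression after log-transforming the phenotype.
fit_log = linregress(np.log2(cell_size), expr)
# Spearman is unchanged by a monotonic transform of the phenotype.
rho_log, _ = spearmanr(np.log2(cell_size), expr)

print(f"regression r (raw size):   {fit_raw.rvalue:.3f}")
print(f"Pearson r (raw size):      {r_pearson:.3f}")
print(f"Spearman rho:              {rho:.3f}")
print(f"regression r (on ranks):   {fit_rank.rvalue:.3f}")
print(f"regression r (log2 size):  {fit_log.rvalue:.3f}")
print(f"Spearman rho (log2 size):  {rho_log:.3f}")
```

One practical difference is that regression lets the nuisance phenotypes enter as covariates in a single model rather than requiring stratification, at the cost of needing a sensible scale (for example a log transform) for cell size.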