Hello,
I have single-cell RNA-seq samples of cancer cell populations taken either at relapse or at diagnosis. There are therefore two states, and I would like to characterize the relapse state by the expression of certain genes. I have previously identified a set of genes that might be characteristic of relapse.
To test this, I plan to fit a logistic regression with the state (relapse or diagnostic) as the target variable and the expression of the selected genes (about 200 genes) as the predictors. However, I am having difficulty checking the assumptions behind a logistic model, in particular the absence of multicollinearity among the predictors and a linear relationship between the logit of the target variable and each predictor.
Are there any particular packages/functions that can be used to verify these assumptions?
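To make the two checks concrete, here is a minimal, dependency-free sketch of what I mean, run on simulated data (the gene names, the simulated values and the helper functions are all made up for illustration, not taken from my data). Multicollinearity is checked with variance inflation factors (VIF), computed as the diagonal of the inverse correlation matrix; linearity of the logit is eyeballed with binned empirical log-odds. In practice a library routine such as `variance_inflation_factor` from statsmodels would probably be a better choice than hand-rolled code.

```python
import math
import random
from statistics import mean, pstdev


def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length columns."""
    n = len(columns[0])
    z = []
    for col in columns:
        m, s = mean(col), pstdev(col)
        z.append([(v - m) / s for v in col])
    p = len(columns)
    return [[sum(a * b for a, b in zip(z[i], z[j])) / n for j in range(p)]
            for i in range(p)]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    p = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(p)]
           for i, row in enumerate(matrix)]
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(aug[r][c]))
        if abs(aug[pivot][c]) < 1e-12:
            raise ValueError("singular matrix: perfectly collinear predictors")
        aug[c], aug[pivot] = aug[pivot], aug[c]
        d = aug[c][c]
        aug[c] = [v / d for v in aug[c]]
        for r in range(p):
            f = aug[r][c]
            if r != c and f:
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[c])]
    return [row[p:] for row in aug]


def vif(columns):
    """VIF_j = 1 / (1 - R_j^2), i.e. the j-th diagonal entry of the
    inverse of the predictors' correlation matrix."""
    inv = invert(correlation_matrix(columns))
    return [inv[j][j] for j in range(len(columns))]


def empirical_logits(x, y, n_bins=5):
    """Mean of x and smoothed log-odds of y == 1 in equal-count bins of x.
    Roughly evenly spaced log-odds suggest a linear logit."""
    pairs = sorted(zip(x, y))
    size = len(pairs) // n_bins
    out = []
    for b in range(n_bins):
        chunk = pairs[b * size:(b + 1) * size] if b < n_bins - 1 else pairs[b * size:]
        pos = sum(label for _, label in chunk)
        neg = len(chunk) - pos
        # +0.5 avoids log(0) when a bin contains a single class
        out.append((mean(v for v, _ in chunk), math.log((pos + 0.5) / (neg + 0.5))))
    return out


# Simulated example: gene_b is almost a copy of gene_a, gene_c is independent.
random.seed(0)
n_cells = 500
gene_a = [random.gauss(0, 1) for _ in range(n_cells)]
gene_b = [a + random.gauss(0, 0.1) for a in gene_a]
gene_c = [random.gauss(0, 1) for _ in range(n_cells)]
# Hypothetical state: 1 = relapse, 0 = diagnostic, driven by gene_a only.
state = [1 if random.random() < 1 / (1 + math.exp(-1.5 * a)) else 0 for a in gene_a]

vifs = vif([gene_a, gene_b, gene_c])
for name, value in zip(["gene_a", "gene_b", "gene_c"], vifs):
    print(f"{name}: VIF = {value:.1f}")

logits = empirical_logits(gene_a, state)
for bin_mean, log_odds in logits:
    print(f"mean expression {bin_mean:+.2f} -> log-odds {log_odds:+.2f}")
```

A common rule of thumb treats VIF values above roughly 5 to 10 as a sign of problematic collinearity. With about 200 genes I expect many of them to be correlated, which is part of why I am unsure how to proceed.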
Hey Ming Tang,
Thanks for your answer.
I'm sorry for the late reply, but I wanted to read the article you sent me before answering.
I have now read it, and I also read up on PLS. I have two questions that you might be able to answer:
PS: Great blog by the way!