For a 2 by 2 table in which one cell has a value of zero, we can run Firth's logistic regression using the "logistf" R package. What if the outcome (dependent variable) is survival time? Is there a corresponding R package for fitting a Cox regression model with rare variants or copy number alterations as predictors?
Thank you,
Ding
No, my worry is that the proportional hazards regression model does not fit well to data sets with rare variants as predictor variables. For example, suppose that for a given marker or gene, 3 of 100 patients carry a mutation (coded 1), and all three show long survival (> 5 years, with no recurrence), while the other 97 patients, who do not carry that mutation, all have survival times < 5 years. If survival time is treated as a categorical dependent variable (0 for < 5 years, 1 for > 5 years), one would use Firth's logistic regression instead of ordinary logistic regression for such sparse data. However, if one wants to treat survival time as a continuous variable and fit a Cox proportional hazards regression model, the hazard ratio would be extreme (very large or very small, depending on the reference group) or estimated inaccurately. I would think someone has already developed a modified Cox regression model for data sets with a rare variant as the independent variable.
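
To make the concern concrete, here is a minimal sketch in plain Python (not R, and not a real analysis). The data are invented to mimic the 3-of-100 example above, and the function and variable names are hypothetical. It evaluates the Cox partial log-likelihood for the carrier indicator and shows that it keeps increasing as the log hazard ratio goes to minus infinity. It then adds a Firth-type penalty (half the log of the observed information) as one assumed form a "modified" Cox model could take, which gives a finite maximizer.

```python
import math

# Invented data following the example above: 97 non-carriers with an event
# before 5 years, 3 carriers censored (no recurrence) after 5 years.
times = [0.05 * k for k in range(1, 98)] + [6.0, 7.0, 8.0]
events = [1] * 97 + [0] * 3
carrier = [0] * 97 + [1] * 3


def cox_loglik_and_info(beta, times, events, x):
    """Cox partial log-likelihood and observed information
    for a single covariate, assuming no tied event times."""
    # Walk from the longest to the shortest time so the running sums
    # always describe the current risk set.
    order = sorted(range(len(times)), key=lambda i: times[i], reverse=True)
    s0 = s1 = s2 = 0.0
    loglik = info = 0.0
    for i in order:
        w = math.exp(beta * x[i])
        s0 += w
        s1 += w * x[i]
        s2 += w * x[i] ** 2
        if events[i]:
            loglik += beta * x[i] - math.log(s0)
            info += s2 / s0 - (s1 / s0) ** 2
    return loglik, info


def penalized_loglik(beta):
    """Partial log-likelihood plus a Firth-type penalty, 0.5 * log(information)."""
    loglik, info = cox_loglik_and_info(beta, times, events, carrier)
    return loglik + 0.5 * math.log(info)


# Unpenalized: the log-likelihood keeps rising as beta decreases (no finite maximum).
for beta in (0.0, -5.0, -10.0):
    print(beta, cox_loglik_and_info(beta, times, events, carrier)[0])

# Penalized: a coarse grid search is enough to locate a finite maximizer.
grid = [-10.0 + 0.01 * k for k in range(1201)]
beta_firth = max(grid, key=penalized_loglik)
print("penalized estimate of the log hazard ratio:", round(beta_firth, 2))
```

The grid search is only for illustration. A real fit would use Newton-type iterations and profile penalized likelihood confidence intervals, and would need to handle tied event times.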