I am learning survival analysis in R, in particular the Cox proportional hazards model. I read a paper that uses 80% of the samples as a training set and the remaining 20% as a test set.
To quote the paper:
On the training set, we first performed a pre-selection step to keep the top significant features correlated with overall survival (univariate Cox model, likelihood ratio test, P< 0.05). ...We used two computational methods to train the models: (i) Cox: the Cox proportional hazards model with LASSO for feature selection ...We then applied the models thereby obtained to the test set for prediction, and calculated the C-index using the R package survcomp.
I do not understand how they actually applied the models fitted with the Cox model to the test set. For the training set, I can simply call the coxph function, but the returned results are "coef, exp(coef), se(coef), z, p" and the likelihood ratio test p-value. How can I treat this as a model and use it on the 20% test set?
Could you give the reference, please?
The paper is "Assessing the clinical utility of cancer genomic and proteomic data across tumor types", published in Nature Biotechnology. Thanks!
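One common way to read "applying the model to the test set" is this: the fitted coefficients define a risk score (the linear predictor, i.e. the weighted sum of the covariates), and that score can be computed for new patients and compared against their observed survival. Below is a minimal sketch of that idea, not the paper's actual pipeline. It assumes the `lung` dataset shipped with the survival package as stand-in data, an arbitrary choice of three covariates, a plain `coxph` fit without the LASSO step, and `survival::concordance` for the C-index instead of survcomp.

```r
library(survival)

# Stand-in data (assumption): the lung dataset from the survival package,
# restricted to rows with complete values for the columns used here.
vars <- c("time", "status", "age", "sex", "ph.ecog")
d <- lung[complete.cases(lung[, vars]), vars]

# 80% / 20% random split into training and test sets.
set.seed(1)
idx   <- sample(nrow(d), size = floor(0.8 * nrow(d)))
train <- d[idx, ]
test  <- d[-idx, ]

# Fit the Cox model on the training set only.
fit <- coxph(Surv(time, status) ~ age + sex + ph.ecog, data = train)

# "Apply the model": compute the linear predictor (risk score) for each
# test patient from the coefficients estimated on the training set.
test$lp <- predict(fit, newdata = test, type = "lp")

# C-index on the test set. reverse = TRUE because a higher risk score
# should correspond to shorter survival.
cidx <- concordance(Surv(time, status) ~ lp, data = test, reverse = TRUE)
print(cidx$concordance)
```

So the coefficient table printed by `coxph` is only a summary; the returned object itself is the model, and `predict(..., newdata = )` is what carries it over to unseen data. If the model comes from a penalized fit (for example glmnet with `family = "cox"`), the same logic should apply with that package's own `predict` method, and the resulting risk scores can be passed to whichever C-index function you prefer.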