The Details section of the documentation for the coxph function in the survival package says:
The routine internally scales and centers data to avoid overflow in the argument to the exponential function. These actions do not change the result, but lead to more numerical stability. However, arguments to offset are not scaled since there are situations where a large offset value is a purposefully used. In general, however, users should avoid very large numeric values for an offset due to possible loss of precision in the estimates.
How should I understand "The routine internally scales and centers data"? Does this mean that coxph will scale our input data before processing it?
If not, how should I scale data whose variables are on very different scales? For example, take a dataset with the continuous variables age, riskScore, and GeneA_FPKM, and the categorical variables gender (coded as 0 and 1) and tumor stage (coded as 0, 1, 2, 3, 4). Should I apply the scale function to all variables, or only to the continuous ones?
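
To make the second option concrete, here is a minimal sketch of what I mean by scaling only the continuous variables. The data frame dat and all of its values are made up for illustration; only the column names come from the dataset described above. It uses base R only and does not fit a model.

```r
# Hypothetical toy data: column names follow the question, values are invented.
dat <- data.frame(
  age        = c(45, 60, 52, 71, 38, 66),
  riskScore  = c(0.8, 2.1, 1.4, 3.0, 0.5, 2.6),
  GeneA_FPKM = c(120.5, 15.2, 980.0, 43.7, 310.1, 5.9),
  gender     = c(0, 1, 1, 0, 0, 1),   # coded 0/1
  stage      = c(0, 2, 1, 4, 1, 3)    # coded 0..4
)

# Columns treated as continuous (an assumption of this sketch).
continuous <- c("age", "riskScore", "GeneA_FPKM")

# Center each continuous column to mean 0 and scale it to standard deviation 1.
# scale() returns a one-column matrix, so convert back to a plain numeric vector.
scaled <- dat
scaled[continuous] <- lapply(dat[continuous], function(x) as.numeric(scale(x)))

# The coded categorical columns (gender, stage) are left exactly as they were.
summary(scaled)
```

The alternative would be to pass every column, including gender and stage, through scale in the same way.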