Hiya,
I am looking at fitting a Cox proportional hazards model in R using the survival package. I have a status variable (mo: 1=died, 0=censored), time in hours (hour), the temperature treatment (temp_c, 3 groups), and condition factor (CF, continuous var).
I am confused about handling the temperature group variable and have trialled 2 methods:
library(survival)
data <- read.delim("coxnew.txt", header=TRUE)
data$SurvObj <- with(data, Surv(hour, mo == 1))
(i) I have seen that in example models online, sex: m=0, f=1 or similar, so I used L=1,M=2,H=3 as the groups.
mod <- coxph(SurvObj ~ temp_c + CF , data = data)
and this gave me a summary with two outputs, one for temp_c and one for CF.
(ii) I have also made temperature treatment a factor
data$temp_c <- factor(data$temp_c)
data$temp_c<- relevel(data$temp_c, ref="H")
mod <- coxph(SurvObj ~ temp_c +CF , data = data)
and this gave me a summary with three outputs, one for temp_cL, one for temp_cM and one for CF
I am not sure which is correct to use, as (ii) requires that you then set one group as the reference. The confusion arises when I plot the survival curves for each temperature group with CF held at its mean value, as the output graphs look different for the two methods.
(i)
temp_new <- with(data, data.frame(temp_c= c(1,2,3), CF = rep(mean(CF, na.rm = TRUE), 3)))
or (ii)
temp_new <- with(data, data.frame(temp_c= c("L","M","H"), CF = rep(mean(CF, na.rm = TRUE), 3)))
Does it matter which I use - is it personal preference - or does one version make more statistical sense? I was inclined to go with the first (i), as this is comparable to the m=0, f=1 style, and I assume it compares the three groups among themselves rather than just comparing two groups to the assigned reference level?
Many thanks, Bekah
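One way to see why the two sets of curves differ is to fit both codings to the same data and predict at the mean CF. The sketch below uses simulated data: the data frame sim, its death rates, and the columns temp_num and temp_fac are made up for illustration and are not taken from coxnew.txt, so only the pattern matters, not the numbers.

```r
library(survival)

set.seed(1)
n <- 150

# Hypothetical data, simulated only so that the sketch runs on its own
sim <- data.frame(
  temp_c = rep(c("L", "M", "H"), each = n / 3),
  CF     = rnorm(n, mean = 1, sd = 0.1),
  stringsAsFactors = FALSE
)
rate       <- c(L = 0.05, M = 0.03, H = 0.01)[sim$temp_c]  # assumed death rates per hour
death_time <- rexp(n, rate = rate)
sim$hour   <- pmin(death_time, 96)           # follow-up stops at 96 h
sim$mo     <- as.integer(death_time <= 96)   # 1 = died, 0 = censored

# (i) numeric coding: a single slope, i.e. a linear trend across 1, 2, 3
sim$temp_num <- unname(c(L = 1, M = 2, H = 3)[sim$temp_c])
# (ii) factor coding: one coefficient per non-reference group
sim$temp_fac <- relevel(factor(sim$temp_c), ref = "H")

mod_num <- coxph(Surv(hour, mo) ~ temp_num + CF, data = sim)
mod_fac <- coxph(Surv(hour, mo) ~ temp_fac + CF, data = sim)

# Predicted curves for each group with CF held at its mean
new_num <- data.frame(temp_num = c(1, 2, 3), CF = mean(sim$CF))
new_fac <- data.frame(temp_fac = c("L", "M", "H"), CF = mean(sim$CF))
fit_num <- survfit(mod_num, newdata = new_num)
fit_fac <- survfit(mod_fac, newdata = new_fac)

par(mfrow = c(1, 2))
plot(fit_num, col = 1:3, xlab = "Hours", ylab = "Survival", main = "(i) numeric 1, 2, 3")
legend("bottomleft", legend = c("L", "M", "H"), col = 1:3, lty = 1)
plot(fit_fac, col = 1:3, xlab = "Hours", ylab = "Survival", main = "(ii) factor, ref = H")
legend("bottomleft", legend = c("L", "M", "H"), col = 1:3, lty = 1)
```

With the numeric coding, the model assumes the log hazard changes by the same amount from L to M as from M to H, so the three curves are forced into that pattern. With the factor coding, each group gets its own estimate relative to the reference; changing the reference level changes which hazard ratios are printed, but not the fitted curves.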
Cheers! Okay, hopefully I am correct in interpreting your answer as follows. If it's set as
data$SurvObj <- with(data, Surv(hour, mo == 1))
A hazard ratio of 0.7 for temperature coded simply as 1, 2, 3 would mean that, with increasing group number, the chance of death (1) over censoring (0) decreases from temperature treatment 1 --> 2 --> 3.
Hazard ratios of 5 for L and 3 for M, with temperature coded as a factor and H as the reference level, would mean an increased chance of death (1) over censoring (0) for both L and M when compared to H, with the chance being higher for temperature L?
Best wishes, Bekah
Yes, 0.7 indicates that, with increasing temp_c value, the HR is reduced (the hazard is multiplied by 0.7 for each one-step increase), when adjusted for your condition factor (CF). I am not sure of the exact interpretation of having temperature encoded as a continuous variable of 1, 2, 3 - it would make more sense to be categorical. A continuous temperature variable makes more sense as Kelvin values, or, granted, Celsius.
The other values for L and M are readily interpreted. Considering that you set H as the reference level, it says that the low temperature group has the highest hazard of death, when adjusted for CF. The medium temperature group also has a higher hazard of death (i.e., higher hazard when compared to the high temperature group).
You should also be looking at the upper and lower confidence intervals (CIs), and the log-rank p-value. For example, a general rule of thumb: if we have a HR=0.7 but its upper CI passes 1.0, then that is less reliable and this will be reflected in the p-value.
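As a rough sketch of how to pull out the hazard ratios, their confidence intervals, and the overall test from a fitted model, something like the following could be used. The data here are simulated purely for illustration (the data frame sim and its death rates are assumptions, not your coxnew.txt), so the printed numbers will not match yours.

```r
library(survival)

set.seed(42)
n <- 150

# Hypothetical data, simulated only so that the sketch runs on its own
sim <- data.frame(
  temp_c = rep(c("L", "M", "H"), each = n / 3),
  CF     = rnorm(n, mean = 1, sd = 0.1),
  stringsAsFactors = FALSE
)
rate       <- c(L = 0.05, M = 0.03, H = 0.01)[sim$temp_c]  # assumed death rates per hour
death_time <- rexp(n, rate = rate)
sim$hour   <- pmin(death_time, 96)           # follow-up stops at 96 h
sim$mo     <- as.integer(death_time <= 96)   # 1 = died, 0 = censored
sim$temp_c <- relevel(factor(sim$temp_c), ref = "H")

mod <- coxph(Surv(hour, mo) ~ temp_c + CF, data = sim)
s   <- summary(mod)

# Hazard ratios with 95% confidence limits, one row per coefficient
hr_table <- as.data.frame(s$conf.int[, c("exp(coef)", "lower .95", "upper .95")])
names(hr_table) <- c("HR", "lower95", "upper95")
hr_table <- round(hr_table, 3)

# TRUE when the interval lies entirely above or entirely below 1
hr_table$ci_excludes_1 <- hr_table$lower95 > 1 | hr_table$upper95 < 1
print(hr_table)

# Score (log-rank) test for the model as a whole
print(s$sctest)
```

Rows named temp_cL and temp_cM are the comparisons of L and M against the reference H; a row whose interval includes 1 is one where the data are compatible with no difference from H.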
Thank you so much for all your help! :) this is much clearer now!