Question

question about survival analysis

5.9 years ago

tujuchuanli ▴ 130

Hi, I chose two genes and performed survival analysis on them. The model is "coxph(Surv(time,censor) ~ exprs1+exprs2)", where exprs1 and exprs2 are the TMM-normalized expression values for gene1 and gene2. time is the survival time (for dead patients) or the last follow-up time (for living patients), and censor indicates whether each patient is dead or alive.

You can see the resulting survival plot at this URL (https://pan.baidu.com/s/1BDrmHAuAmW6fiyfsSJcHyw).

You can see that the group with low expression of both gene1 and gene2 has the worst survival, and the group with high expression of both genes has the best (using the median expression value as the cutoff). However, the group with high expression of gene2 and low expression of gene1 falls between the worst and the best curves, as does the group with high expression of gene1 and low expression of gene2.

Here is my question: can I say that there is an interaction between these two genes with respect to survival? If so, can I change my model to "coxph(Surv(time,censor) ~ exprs1*exprs2)" to test for the interaction?

Thanks

survival analysis;interaction • 1.4k views

ADD COMMENT • link updated 4.2 years ago by Biostar 20 • written 5.9 years ago by tujuchuanli ▴ 130

Hi Kevin,

It is very nice to see you~~.

If I use "coxph(Surv(time,censor) ~ exprs1)" or "coxph(Surv(time,censor) ~ exprs2)" or "coxph(Surv(time,censor) ~ exprs1+exprs2)", all three results are significant.

However, there is something I didn't explain clearly in my post. The plot you saw was produced by "km.coxph.plot(formula.s=Surv(time=time, died) ~ group)", where group is a factor that defines the four groups in the plot. The cutoff used here is the median of the logCPM values for gene1 and gene2. Below is the output of the "summary" function applied to the Cox model:

https://pan.baidu.com/s/1D2z7715PFHLhwLA7_go0AA

ADD REPLY • link 5.9 years ago by tujuchuanli ▴ 130

Hello tujuchuanli - nice to see you, too.

I see... I thought that your groups were somehow defined by exprs1 and exprs2. In your case, those p-values are produced by comparing the following:

group2 versus group1
group3 versus group1
group4 versus group1

So, group1 is regarded as the reference level. For the model, generally, you could report the Score (logrank) test p-value.
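As a rough illustration of the reference-level point, the sketch below fits a Cox model to a four-level group factor and extracts the overall score (logrank) test. Everything in it is an assumption made for illustration: the data are simulated, the group labels ("low/low" and so on) are made up, and plain coxph from the survival package stands in for the plotting function used in the thread.

```r
library(survival)

set.seed(1)
n <- 200

# Simulated, purely illustrative data (not the poster's data)
exprs1 <- rnorm(n)
exprs2 <- rnorm(n)
event_time <- rexp(n, rate = 0.1 * exp(-0.5 * exprs1 - 0.5 * exprs2))
cens_time  <- rexp(n, rate = 0.03)
time <- pmin(event_time, cens_time)           # observed follow-up time
died <- as.integer(event_time <= cens_time)   # 1 = death observed, 0 = censored

# Dichotomise each gene at its median and combine into a four-level factor
g1 <- ifelse(exprs1 > median(exprs1), "high", "low")
g2 <- ifelse(exprs2 > median(exprs2), "high", "low")
group <- factor(paste(g1, g2, sep = "/"),
                levels = c("low/low", "low/high", "high/low", "high/high"))

fit <- coxph(Surv(time, died) ~ group)
s <- summary(fit)

# One row per non-reference level, each compared with the first level ("low/low")
print(s$coefficients)

# Overall score (logrank) test for the model as a whole
print(s$sctest)

# To compare against a different reference, relevel the factor and refit
group_ref <- relevel(group, ref = "high/high")
fit_ref <- coxph(Surv(time, died) ~ group_ref)
```

The first level of the factor is the reference, so the three per-group p-values all answer "does this group differ from the reference?", whereas the score test asks whether the groups differ at all.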

The p-values are very low. I will not ask what these genes are, though.

ADD REPLY • link 5.9 years ago by Kevin Blighe 88k

Thank you for your reply, Kevin~~

Following your suggestion that "interaction" is a bit misleading, I've changed my wording. I now describe these as gene pairs in which different combinations of the two genes' expression levels are associated with different survival outcomes. Is that better?

BTW, do you think this is an interesting finding?

ADD REPLY • link 5.9 years ago by tujuchuanli ▴ 130

Please use ADD REPLY/ADD COMMENT when responding to existing posts to keep the thread logically organized.

ADD REPLY • link 5.9 years ago by GenoMax 147k

score 0 · Answer 1 · 2018-12-26

The lines 'cross over', but using the word 'interaction' in this context is a bit misleading - the level of cross-over is also minimal. Also, when you have ~ exprs1+exprs2, you are specifying additive effects of these genes (on the log-hazard scale), each with its own coefficient and with no interaction term. So, it makes sense for cross-over to occur when you compare HIGH/LOW and LOW/HIGH. How does it look if you just run:

~ exprs1

~ exprs2
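A minimal sketch of the two single-gene models, set next to the additive and interaction models from the question, might look like the following. The data frame d is simulated and hypothetical, and the likelihood-ratio comparison via anova is one common way to check an interaction term, not something proposed in this thread.

```r
library(survival)

set.seed(42)
n <- 300

# Simulated stand-ins for log-scale expression values (not real data)
exprs1 <- rnorm(n, mean = 5)
exprs2 <- rnorm(n, mean = 5)
event_time <- rexp(n, rate = 0.05 * exp(-0.4 * (exprs1 - 5) - 0.4 * (exprs2 - 5)))
cens_time  <- rexp(n, rate = 0.02)

d <- data.frame(
  time   = pmin(event_time, cens_time),
  censor = as.integer(event_time <= cens_time),  # 1 = death observed, 0 = censored
  exprs1 = exprs1,
  exprs2 = exprs2
)

# Single-gene models
fit1 <- coxph(Surv(time, censor) ~ exprs1, data = d)
fit2 <- coxph(Surv(time, censor) ~ exprs2, data = d)

# Additive model: one coefficient per gene, no interaction
fit_add <- coxph(Surv(time, censor) ~ exprs1 + exprs2, data = d)

# Interaction model: exprs1 * exprs2 expands to exprs1 + exprs2 + exprs1:exprs2
fit_int <- coxph(Surv(time, censor) ~ exprs1 * exprs2, data = d)

print(coef(fit_add))
print(coef(fit_int))

# Likelihood-ratio test: does adding the interaction term improve the fit?
lrt <- anova(fit_add, fit_int)
print(lrt)
```

If the exprs1:exprs2 term is not supported by the likelihood-ratio test, the intermediate HIGH/LOW and LOW/HIGH curves are what an additive model would produce anyway.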

Also, using the TMM-normalized counts, which are still on the count scale (modeled as negative binomial), may be biasing the result. Assuming that you have used edgeR, you should be inputting logCPM values.
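To show what moving from counts to a log-CPM scale involves, here is a simplified base-R sketch. It is an assumption-laden approximation: the count matrix is simulated, no TMM normalization factors are applied, and a constant prior count is used. In practice one would more likely use edgeR's own cpm function with log = TRUE on a normalized DGEList rather than hand-rolling this.

```r
set.seed(7)

# Hypothetical count matrix: 4 genes (rows) x 6 samples (columns)
counts <- matrix(rnbinom(24, mu = 200, size = 2), nrow = 4,
                 dimnames = list(paste0("gene", 1:4), paste0("s", 1:6)))

lib_size    <- colSums(counts)  # per-sample library size
prior_count <- 2                # small offset so that zero counts stay finite

# Divide each column by its (offset-adjusted) library size, scale to
# counts per million, then take log2
log_cpm <- log2(
  sweep(counts + prior_count, 2, lib_size + 2 * prior_count, "/") * 1e6
)

print(round(log_cpm, 2))
```

The log transform compresses the long right tail of the counts, which tends to make a linear term in the Cox model less dominated by a few very highly expressing samples.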