Question

Predicting survival months of patients in TCGA data

5.4 years ago

akutasame ▴ 40

Using a Cox model, we can estimate survival probabilities, but I want to predict each patient's survival months using machine learning. Most TCGA datasets include overall survival (OS) months for patients. Do I still need to account for censoring when predicting survival months?

Thank you

survival tcga os_months predict machine learning • 1.7k views

The censoring rate is too high (85%) and there are only around 700 patients, so I won't have many uncensored patients left to train on. Another thing: if the TCGA data are incomplete, I think predicting overall survival months (a continuous outcome) should incorporate censoring. I should not treat it as complete data, otherwise the whole analysis is wrong. Do you agree with that, Kevin?
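To make the concern concrete, here is a minimal sketch in plain Python comparing a Kaplan-Meier estimate that respects censoring with one that wrongly treats every follow-up time as a death. The `months` and `events` values are made-up toy numbers, not TCGA data, and `kaplan_meier` is a hypothetical helper written for this illustration.

```python
# Toy follow-up data (hypothetical values, not taken from TCGA).
# event = 1 means death was observed; event = 0 means the patient was censored.
months = [5, 8, 12, 12, 20, 24, 30, 36, 40, 48]
events = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]


def kaplan_meier(times, observed):
    """Return [(time, survival probability)] at each distinct death time."""
    curve = []
    surv = 1.0
    for t in sorted(set(times)):
        # Patients still under observation just before time t.
        at_risk = sum(1 for x in times if x >= t)
        # Deaths recorded exactly at time t.
        deaths = sum(1 for x, e in zip(times, observed) if x == t and e == 1)
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
    return curve


# Wrong: pretend every last follow-up was a death.
naive = kaplan_meier(months, [1] * len(months))
# Censoring-aware: censored patients leave the risk set without counting as deaths.
censored_aware = kaplan_meier(months, events)

print("naive survival at 24 months:", dict(naive)[24])
print("censoring-aware survival at 24 months:", dict(censored_aware)[24])
```

On this toy cohort the naive curve sits well below the censoring-aware one, which is the kind of bias I would expect to be much worse at an 85% censoring rate.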

Reply by akutasame ▴ 40, 5.4 years ago

Your intuition is as good as mine, akutasame

Reply by Kevin Blighe 89k, 5.4 years ago

score 0 · Answer 1 · 2019-12-22

I would try it both with and without the censored patients. For many of the censored TCGA patients, it is impossible to know whether they eventually died or, if so, how long after the study. Check the other variable, Vital Status, too.

Depending on the model / classifier that you aim to use, you may even be able to include the censored patients and encode them as censored.
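As a rough sketch of what that encoding could look like: each patient becomes a (months, event) pair, where the event flag comes from Vital Status, and predictions are scored with a metric that tolerates censoring. The example below uses a simplified version of Harrell's concordance index in plain Python. The status strings ("Dead", "Alive"), the toy values, and the helper names `encode_event` and `concordance_index` are assumptions for illustration; check how your own clinical file codes vital status.

```python
def encode_event(vital_status):
    """Map a vital-status string to an event flag: 1 = death observed, 0 = censored."""
    return 1 if vital_status.strip().lower() in {"dead", "deceased"} else 0


def concordance_index(months, events, predicted_months):
    """Simplified Harrell's C-index for predicted survival times.

    A pair (i, j) is comparable only when patient i died and had the shorter
    follow-up. Pairs with tied follow-up times are skipped in this sketch.
    """
    concordant = 0.0
    comparable = 0
    n = len(months)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and months[i] < months[j]:
                comparable += 1
                if predicted_months[i] < predicted_months[j]:
                    concordant += 1.0   # predicted order matches observed order
                elif predicted_months[i] == predicted_months[j]:
                    concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs (no observed deaths?)")
    return concordant / comparable


# Hypothetical toy cohort: two observed deaths, three censored patients.
vital = ["Dead", "Alive", "Dead", "Alive", "Alive"]
months = [6, 10, 14, 30, 42]
predicted = [15, 25, 12, 20, 50]   # predicted OS months from some model

events = [encode_event(v) for v in vital]
c_index = concordance_index(months, events, predicted)
print("events:", events)
print("C-index:", round(c_index, 3))
```

The point of the design is that censored patients still contribute information (they outlived everyone who died earlier) without their follow-up time being mistaken for a death time.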

Another idea: use the censored patients as a second validation cohort, on which you will apply your classifier.
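One possible way to read this, sketched under my own assumptions: for a censored patient the only hard fact is that survival was at least as long as the last follow-up, so a predicted survival shorter than the follow-up is definitely wrong. The helper `lower_bound_agreement` and the toy numbers below are hypothetical. Note that this check can only flag contradictions; it cannot confirm that a prediction is right.

```python
def lower_bound_agreement(followup_months, predicted_months):
    """Fraction of censored patients whose predicted survival is >= last follow-up."""
    if not followup_months:
        raise ValueError("empty cohort")
    hits = sum(1 for f, p in zip(followup_months, predicted_months) if p >= f)
    return hits / len(followup_months)


# Hypothetical censored-only validation cohort.
followup = [10, 30, 42, 18]    # months until last contact, all patients alive
predicted = [25, 20, 50, 18]   # predicted OS months from a model trained elsewhere

agreement = lower_bound_agreement(followup, predicted)
print("share of predictions consistent with follow-up:", agreement)
```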

Kevin