Hi there
I am doing survival analysis on TCGA patients of interest. I need to draw Kaplan-Meier curves, which I can do with the "survival" package in R using its bundled "aml" data. I put my own samples' data into the same format and got a curve, but I am not sure whether I am doing it the right way.
For example, the aml data in the survival package has the following format:
> aml
  time status          x
1    9      1 Maintained
2   13      1 Maintained
3   13      0 Maintained
......
My prepared data for the GBM tissue samples has the following format:
     time status      x
1 11.9333      1 class1
2  4.8000      1 class2
3 18.6000      1 class1
....
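For reference, a minimal sketch of the Kaplan-Meier fit on data in this time/status/x layout, shown on the bundled aml data. It assumes status is coded 1 for an observed event (death) and 0 for a censored subject. The plot labels and line types are illustrative choices only.

```r
library(survival)

# aml ships with the survival package:
# time, status (1 = event, 0 = censored), x (group)
fit <- survfit(Surv(time, status) ~ x, data = aml)

# One Kaplan-Meier curve per group in x
plot(fit, lty = 1:2, xlab = "Time", ylab = "Survival probability")
legend("topright", legend = names(fit$strata), lty = 1:2)

# Log-rank test comparing the groups
survdiff(Surv(time, status) ~ x, data = aml)
```

The same calls should work on any data frame with these three columns, as long as status really distinguishes events from censored observations.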
My doubt concerns the status column (the middle column, which is 1 for every row): my data contains no information on censored subjects. I took only the patients who died, with time reported in months from the days_to_death column. All of this data comes from the clinical_patient_gbm.txt file. So my question is: am I missing something, such as censored data? If yes, how do I obtain censored data from TCGA, and from which file (the follow-up file)?
Thanks
Why have you biased the dataset towards the dead individuals only? Have a look for days_to_last_follow_up or similar to complement your current dataset.

You mean he should add days_to_last_follow_up to the current dataset as censored data (status '0')?
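
A rough sketch of that idea in R, not a definitive recipe. The data frame clin is a hypothetical stand-in with made-up values, not TCGA data. The column name vital_status and its "Dead"/"Alive" coding are assumptions, so check the actual names and codes in your clinical file. Dividing days by 30 to get months is also an assumed conversion.

```r
library(survival)

# Hypothetical toy table standing in for the clinical file;
# the values are illustrative only, not real TCGA records.
clin <- data.frame(
  vital_status           = c("Dead", "Alive", "Dead", "Dead", "Alive", "Alive"),
  days_to_death          = c(358, NA, 558, 144, NA, NA),
  days_to_last_follow_up = c(NA, 420, NA, NA, 150, 600),
  x                      = c("class1", "class2", "class1",
                             "class2", "class1", "class2")
)

# status: 1 = death observed, 0 = censored (alive at last follow-up)
dead <- clin$vital_status == "Dead"
clin$status <- as.integer(dead)

# time: days to death for dead patients, days to last follow-up otherwise,
# converted to months (assumed 30 days per month)
clin$time <- ifelse(dead, clin$days_to_death, clin$days_to_last_follow_up) / 30

# Kaplan-Meier fit that now includes the censored patients
fit <- survfit(Surv(time, status) ~ x, data = clin)
summary(fit)
```

Patients who are still alive then stay in the risk set until their last follow-up instead of being dropped, which is what the censoring flag is for.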