Survival analysis: Why categorize age?
4.8 years ago

Hi there!

I have wondered why scientists usually categorize age in survival analysis. For example, with 10-year intervals, a 51-year-old patient falls in the same category as a 60-year-old, but a 61-year-old falls in a different category from the 60-year-old. Why is age not treated as a continuous variable? A reference (textbook) would be appreciated as well.

survival • 1.5k views

The only good reason may be that the response as a function of age, response(age), is complex and cannot be easily modeled. Otherwise it is a "mistake" to categorize anything, since it leads to a loss of statistical power.

OK, another reason: clinicians want a clear, simple cut-off, so that before age x the risk is considered low and after it, high.
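
To make the loss-of-power point concrete, here is a minimal simulation sketch. It is not a survival model: there is no censoring and no Cox regression, and the ages, effect size and noise level are invented for illustration. It only compares how much of the variation in a made-up log survival time is explained by continuous age versus age dichotomized at the median. The function names `pearson` and `retained_r2_fraction` are hypothetical helpers, not part of any library.

```python
import math
import random
import statistics


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def retained_r2_fraction(n=100_000, seed=0):
    """Fraction of explained variance kept after a median split of age."""
    rng = random.Random(seed)
    # Assumption: ages are roughly normal around 60 (illustrative only).
    age = [rng.gauss(60, 10) for _ in range(n)]
    # Assumption: log survival time falls linearly with age, plus noise.
    log_time = [5.0 - 0.03 * a + rng.gauss(0, 1) for a in age]

    # Dichotomize age at the sample median: 1 = older half, 0 = younger half.
    cut = statistics.median(age)
    age_high = [1.0 if a > cut else 0.0 for a in age]

    r_cont = pearson(age, log_time)      # continuous predictor
    r_bin = pearson(age_high, log_time)  # dichotomized predictor
    return (r_bin / r_cont) ** 2


print(round(retained_r2_fraction(), 2))
```

Under these assumptions (normally distributed age, linear effect), theory says the printed fraction should be close to 2/π, i.e. about 0.64, so the median split keeps only around two thirds of the information that continuous age carries about the outcome. With real, censored data the size of the loss will differ.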


This was my "feeling".

I also thought that if there is something mysterious leading statisticians/bioinformaticians to categorize data, one may use "sliding windows":

0-9; 10-19; ...

1-10; 11-20; ...

...

9-18; 19-28; ...

and then recheck the results.
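
A minimal sketch of that sliding-window idea, assuming integer ages and 10-year bins. The helper names `shifted_bin_label` and `same_bin_by_offset` are made up for this example. It shifts the bin boundaries by each offset from 0 to 9 and checks whether two ages (here 60 and 61, as in the question) end up in the same category.

```python
def shifted_bin_label(age, offset, width=10):
    """Label of the age bin containing `age` when bins start at `offset`.

    With width=10, offset 0 gives 0-9, 10-19, ...; offset 1 gives
    1-10, 11-20, ...; offset 9 gives 9-18, 19-28, ...
    Assumes a non-negative integer age.
    """
    start = ((age - offset) // width) * width + offset
    end = start + width - 1
    # The first bin is truncated at 0 when the offset shifts it below zero.
    return f"{max(start, 0)}-{end}"


def same_bin_by_offset(age_a, age_b, width=10):
    """For every offset, report whether the two ages share a bin."""
    return {
        offset: shifted_bin_label(age_a, offset, width)
        == shifted_bin_label(age_b, offset, width)
        for offset in range(width)
    }


for offset in (0, 1, 9):
    print(offset, shifted_bin_label(60, offset), shifted_bin_label(61, offset))

together = same_bin_by_offset(60, 61)
print(sum(together.values()), "of", len(together), "offsets keep 60 and 61 together")
```

In a real analysis one would refit the survival model for each offset and compare the estimates; if the conclusions change noticeably with the offset, they are an artifact of where the cut-offs happen to fall.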


If you want to know what statisticians think, you may read this thread: https://stats.stackexchange.com/questions/16565/what-is-the-effect-of-dichotomising-variables

However, medical people think differently :)


Useful link. Thanks a lot


Thank you. Valuable information.

4.8 years ago

It will depend on the disease under study: many diseases, including cancer, are age-related or confounded by age.

As for the specific difference between 60 and 61, it is simply a cut-off / threshold. One could just as easily make the same argument about p<0.05 and p<0.051.

In other disease areas, 55 may be used as a single cut-off for age, as it is regarded as the age at which menopause commences, although this obviously differs from individual to individual.

Kevin


Thank you. It seems I still have to study more.


Not sure that you need to study more, as such. In certain diseases, there are simply well-established relationships between age and disease. This is therefore more a matter of medicine and epidemiology than of bioinformatics.

What German says is correct, too, i.e., using strict cut-offs and creating categorical variables from continuous variables can result in lost power.

There is no standard for how to deal with this.
