There are dozens of definitions of an R-squared for a Cox model. I really
liked the following paper:
  Korn, Edward L., and Simon, Richard (1990), "Measures of explained
  variation for survival data", Statistics in Medicine, 9, 487-503
which highlights, for me, one of the main issues. Say that we are following
patients with advanced lung cancer (a median survival of < 2 years), that a
model has predicted 10-year survival for patient X, and that the patient
actually lived 20 years. Any physician would say that this was a perfect
prediction, and most R-squared statistics would say that it was very bad --
worse than telling a 3-year survivor that he had only 6 months.
What exactly R-squared SHOULD be for survival studies is a hard question.
What is used in coxph is a moderately good measure proposed by Nagelkerke,
which also has the virtue of being easy to compute.
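
To illustrate why a measure of this kind is easy to compute, here is a
minimal sketch of the usual likelihood-ratio form, assuming only the null
and fitted log partial likelihoods and a sample size n are available. It is
not the coxph source code; the function name, the argument names, and the
choice of n (observations rather than events) are assumptions made for
illustration.

```python
import math

def lr_r_squared(loglik_null, loglik_model, n):
    """Likelihood-ratio R-squared, its upper bound, and the rescaled value.

    loglik_null  -- log (partial) likelihood of the model with no covariates
    loglik_model -- log (partial) likelihood of the fitted model
    n            -- sample size (assumed here to be the number of observations)
    """
    # Likelihood ratio test statistic
    lr_stat = 2.0 * (loglik_model - loglik_null)
    # Unscaled measure: 1 - exp(-LR / n)
    r2 = 1.0 - math.exp(-lr_stat / n)
    # Largest value the unscaled measure can reach for these data
    r2_max = 1.0 - math.exp(2.0 * loglik_null / n)
    # Rescaled version, which can reach 1
    return r2, r2_max, r2 / r2_max

# Hypothetical log-likelihoods, not taken from any real fit
r2, r2_max, r2_scaled = lr_r_squared(-100.0, -90.0, 50)
print(round(r2, 3), round(r2_max, 3), round(r2_scaled, 3))
```

Nothing beyond the two log-likelihoods and n is needed, which is what makes
the measure cheap. It also shows the limitation: the value depends only on
the likelihood ratio, not on how close individual predictions were.
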
Michael Schemper has given, I think, the most thought to the issues and
has published several papers. (He also points out that Nagelkerke's idea
appears earlier due to someone else -- I forget who). To dig deeper I would
recommend looking at his work.
Terry Therneau