I'm posting this for a colleague who has found what he believes to be a bug
in kaplanMeier(). Any thoughts? Insightful?
----------------------------------------------------
The problem concerns the upper and lower confidence limits, which are
generally too narrow at the high end of the x-scale. Using the default
conf.interval = 'log' option, these intervals are supposed to be based on
the logarithm of the survivor function, which when exponentiated back
produces a confidence interval for the survivor function itself. For
instance,
S(x) * exp(c(-1,1)*pnorm(.975)*se.log.S(x))
gives an approximate 95% CI for the true survivor function, for which S(x)
is the Kaplan-Meier estimator. The quantity se.log.S(x) is the estimated
standard error of log(S(x)), using Greenwood's formula. Greenwood's formula
is also used to estimate the standard error of the K-M estimator: se.S(x) =
S(x)*se.log.S(x), which is based on a delta-method argument.
THE ERROR. The kaplanMeier output gives se.S(x), which is fine, but it
reports the 95% confidence interval as
S(x) * exp(c(-1,1)*pnorm(.975)*se.S(x))
EXAMPLE: Use the built-in leukemia data set:
leuklist <- kaplanMeier(censor(time,status) ~ group, data = leukemia)
leuklist[[1]][[1]] # shows results for the first group
time1 time2 survival std.err lower upper
1 -1.0e+029 9 1.0000000 0.00000000 1.0000000 1.0000000
2 9.0e+000 13 0.9090909 0.08667841 0.7670550 1.0000000
3 1.3e+001 18 0.8181818 0.11629130 0.6514221 1.0000000
4 1.8e+001 23 0.7159091 0.13966496 0.5444711 0.9413278
5 2.3e+001 31 0.6136363 0.15263231 0.4549778 0.8276218
6 3.1e+001 34 0.4909090 0.16419324 0.3558275 0.6772711
7 3.4e+001 48 0.3681818 0.16266887 0.2676691 0.5064380
8 4.8e+001 161 0.1840909 0.15349272 0.1362633 0.2487055
attr(, "n.censored"):
[1] 4
attr(, "n.observed"):
[1] 11
attr(, "conf.interval"):
[1] "log"
attr(, "coverage"):
[1] 0.95
Now, let's verify the last row of the matrix:
> .1840909 * exp(c(-1,1)*qnorm(.975)*.15349272)
[1] 0.1362634 0.2487056
The problem is that this is wrong. Here, std.err is Greenwood's formula,
not for log(S(x)), but for S(x).
-------------------------------------------------------------
Bob Koyak (via Sam Buttrey)
Assistant Professor
Operations Research Dept.
Naval Postgraduate School
rakoyak@nps.navy.mil
|