s-news
[Top] [All Lists]

KaplanMeier CI computations?

To: "'s-news@wubios.wustl.edu'" <s-news@wubios.wustl.edu>, "Koyak, Robert" <rakoyak@nps.navy.mil>
Subject: KaplanMeier CI computations?
From: "Buttrey, Samuel" <sebuttre@nps.navy.mil>
Date: Mon, 26 Feb 2001 15:37:22 -0800
Cc: "'bugs@insightful.com'" <bugs@insightful.com>
I'm posting this for a colleague who has found what he believes to be a bug
in kaplanMeier(). Any thoughts? Insightful?
----------------------------------------------------
The problem concerns the upper and lower confidence limits, which are
generally too narrow at the high end of the x-scale.  Using the default
conf.interval = 'log' option, these intervals are supposed to be based on
the logarithm of the survivor function, which when exponentiated back
produces a confidence interval for the survivor function itself.  For
instance,

S(x) * exp(c(-1,1)*pnorm(.975)*se.log.S(x))

gives an approximate 95% CI for the true survivor function, for which S(x)
is the Kaplan-Meier estimator.  The quantity se.log.S(x) is the estimated
standard error of log(S(x)), using Greenwood's formula.  Greenwood's formula
is also used to estimate the standard error of the K-M estimator:  se.S(x) =
S(x)*se.log.S(x), which is based on a delta-method argument.

THE ERROR.  The kaplanMeier output gives se.S(x), which is fine, but it
reports the 95% confidence interval as

S(x) * exp(c(-1,1)*pnorm(.975)*se.S(x))

EXAMPLE: Use the built-in leukemia data set:

leuklist <- kaplanMeier(censor(time,status) ~ group, data = leukemia)
leuklist[[1]][[1]] # shows results for the first group

      time1 time2  survival    std.err     lower     upper 
1 -1.0e+029     9 1.0000000 0.00000000 1.0000000 1.0000000
2  9.0e+000    13 0.9090909 0.08667841 0.7670550 1.0000000
3  1.3e+001    18 0.8181818 0.11629130 0.6514221 1.0000000
4  1.8e+001    23 0.7159091 0.13966496 0.5444711 0.9413278
5  2.3e+001    31 0.6136363 0.15263231 0.4549778 0.8276218
6  3.1e+001    34 0.4909090 0.16419324 0.3558275 0.6772711
7  3.4e+001    48 0.3681818 0.16266887 0.2676691 0.5064380
8  4.8e+001   161 0.1840909 0.15349272 0.1362633 0.2487055
attr(, "n.censored"):
[1] 4
attr(, "n.observed"):
[1] 11
attr(, "conf.interval"):
[1] "log"
attr(, "coverage"):
[1] 0.95

Now, let's verify the last row of the matrix:

> .1840909 * exp(c(-1,1)*qnorm(.975)*.15349272)
[1] 0.1362634 0.2487056

The problem is that this is wrong.  Here, std.err is Greenwood's formula,
not for log(S(x)), but for S(x).

-------------------------------------------------------------

Bob Koyak (via Sam Buttrey)
Assistant Professor
Operations Research Dept.
Naval Postgraduate School
rakoyak@nps.navy.mil

<Prev in Thread] Current Thread [Next in Thread>