version
Version 5.1 Release 1 for Sun SPARC, SunOS 5.5 : 1999
I have found that when survfit() and kaplanMeier() are used to get
the Kaplan-Meier survival curve, they produce (for the same data, of
course)
- the same estimates of survival, with
- the same standard errors (as reported by summary.survfit()), but
- confidence intervals of different widths
The difference boils down to this:
survfit() returns a value labeled "std.err"
summary.survfit() returns a value labeled "std.err"
they are different, and
std.err[summary.survfit] = std.err[survfit] * survival.est
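
A minimal sketch of how such a relation can arise, computed by hand rather than taken from either function. It assumes (this is not confirmed by the documentation for this release) that the survfit() value is the Greenwood standard error of log(survival) and that the summary.survfit() value is the standard error of survival itself. The names se.log and se.surv are hypothetical, chosen only for this illustration.

```r
# Hypothetical hand computation in base R; no survfit() or kaplanMeier() calls.
time    <- 1:10
n.event <- rep(1, 10)              # one event at each time, no censoring
n.risk  <- length(time):1          # 10, 9, ..., 1 subjects at risk

surv <- cumprod(1 - n.event / n.risk)   # Kaplan-Meier estimate

# Greenwood sum: estimated variance of log(surv)
gw      <- cumsum(n.event / (n.risk * (n.risk - n.event)))
se.log  <- sqrt(gw)                # assumed scale of survfit()$std.err
se.surv <- surv * se.log           # assumed scale of summary.survfit()$std.err

keep <- surv > 0                   # the last point has surv = 0, se undefined
print(cbind(time, surv, se.log, se.surv)[keep, ])
```

Under that assumption the factor survival.est is just the delta-method step from the log scale back to the survival scale, so the two "std.err" values would describe the same uncertainty on two different scales.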
The confidence limits returned by summary.survfit() use the standard
errors _without_ multiplying by estimated survival.
The confidence limits returned by kaplanMeier() use the standard
errors _with_ multiplying by estimated survival (although that
multiplication appears to take place inside the Fortran subroutine
that calculates the fit).
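
A minimal sketch of the two ways a standard error can enter the limits, again computed by hand. Which construction each function actually uses is an assumption here, not something established above; ci.log and ci.plain are hypothetical names for this illustration only.

```r
# Hypothetical hand computation in base R; no survfit() or kaplanMeier() calls.
n.risk  <- 10:1
surv    <- cumprod(1 - 1 / n.risk)        # Kaplan-Meier estimate, one event per time
keep    <- surv > 0                       # drop the final point, where surv = 0
surv    <- surv[keep]
se.log  <- sqrt(cumsum(1 / (n.risk * (n.risk - 1))))[keep]  # Greenwood, log scale
se.surv <- surv * se.log                  # same quantity on the survival scale
z       <- qnorm(0.975)

# Interval built on the log scale, then back-transformed
ci.log <- cbind(lower = exp(log(surv) - z * se.log),
                upper = pmin(1, exp(log(surv) + z * se.log)))

# Interval built directly on the survival scale
ci.plain <- cbind(lower = pmax(0, surv - z * se.surv),
                  upper = pmin(1, surv + z * se.surv))

# Difference in interval width between the two constructions
width.diff <- (ci.log[, "upper"] - ci.log[, "lower"]) -
              (ci.plain[, "upper"] - ci.plain[, "lower"])
print(cbind(surv, ci.log, ci.plain))
```

In this sketch both intervals start from the same estimate and the same Greenwood sum, yet their widths differ, because one is symmetric on the log scale and the other on the survival scale.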
Seems to me they can't both be right.
Is this a bug in one of them?
Or is there some other reason?
Thanks
-Don
##
## Example
##
df <- data.frame(time=1:10,event=rep(1,10))
sf <- survfit(Surv(time,event)~1,data=df,type='kaplan-meier')
sfs <- summary(sf)
km <- kaplanMeier(censor(time,event)~1,data=df)
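zval <- qnorm(1-.025)
## limits computed on the log scale and capped at 1
## (the "+" makes these upper limits, despite the "lcl" names)
lcl.sf <- exp( log(sf$surv) + zval*sf$std.err )
lcl.sf[lcl.sf>1] <- 1
## same calculation, using the std.err from summary.survfit()
lcl.sfs <- exp( log(sf$surv) + zval*sfs$std.err )
lcl.sfs[lcl.sfs>1] <- 1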
print(km)
print(sfs)
print(sf$std.err)
print(sfs$std.err)
print(sf$std.err * sf$surv)
print(lcl.sf)
print(lcl.sfs)
--
--------------------------------------
Don MacQueen
Environmental Protection Department
Lawrence Livermore National Laboratory
Livermore, CA, USA
--------------------------------------