I claim there is a bug in kaplanMeier. I tried to get in touch with MathSoft
in this respect, with no success, therefore I am sending the message to the
list asking for comments.
L. Molinari
================================
I have come across a questionable piece of code in the function kaplanMeier.
(S+2000 Rel2 under NT4.0)
But first compare the results of survfit and kaplanMeier:
-----------Call to survfit page 233 in S+ Guide to Statistics vol2
S+> summary(survfit(Surv(time,status) ~ group,leukemia))
Call: survfit(formula = Surv(time, status) ~ group, data = leukemia)
group=Maintained
time n.risk n.event survival std.err lower 95% CI upper 95% CI
9 11 1 0.909 0.0867 0.7541 1.000
13 10 1 0.818 0.1163 0.6192 1.000
18 8 1 0.716 0.1397 0.4884 1.000
23 7 1 0.614 0.1526 0.3769 0.999
31 5 1 0.491 0.1642 0.2549 0.946
34 4 1 0.368 0.1627 0.1549 0.875
48 2 1 0.184 0.1535 0.0359 0.944
group=Nonmaintained
time n.risk n.event survival std.err lower 95% CI upper 95% CI
5 12 2 0.8333 0.1076 0.6470 1.000
8 10 2 0.6667 0.1361 0.4468 0.995
12 8 1 0.5833 0.1423 0.3616 0.941
23 6 1 0.4861 0.1481 0.2675 0.883
27 5 1 0.3889 0.1470 0.1854 0.816
30 4 1 0.2917 0.1387 0.1148 0.741
33 3 1 0.1944 0.1219 0.0569 0.664
43 2 1 0.0972 0.0919 0.0153 0.620
45 1 1 0.0000 NA NA NA
---------------- Then call to kaplanMeier --------------------
S+> kaplanMeier(censor(time,status) ~ group,leukemia)
group=Maintained
Number Observed: 11
Number Censored: 4
Confidence Type: log
Survival Std.Err 95% LCL 95% UCL
(-Inf, 9] 1.000 0.000 1.000 1.000
( 9, 13] 0.909 0.087 0.767 1.000
( 13, 18] 0.818 0.116 0.651 1.000
( 18, 23] 0.716 0.140 0.544 0.941
( 23, 31] 0.614 0.153 0.455 0.828
( 31, 34] 0.491 0.164 0.356 0.677
( 34, 48] 0.368 0.163 0.268 0.506
( 48, 161] 0.184 0.153 0.136 0.249
group=Nonmaintained
Number Observed: 12
Number Censored: 1
Confidence Type: log
Survival Std.Err 95% LCL 95% UCL
(-Inf, 5] 1.000 0.000 1.000 1.000
( 5, 8] 0.833 0.108 0.675 1.000
( 8, 12] 0.667 0.136 0.511 0.870
( 12, 23] 0.583 0.142 0.441 0.771
( 23, 27] 0.486 0.148 0.364 0.650
( 27, 30] 0.389 0.147 0.292 0.519
( 30, 33] 0.292 0.139 0.222 0.383
( 33, 43] 0.194 0.122 0.153 0.247
( 43, 45] 0.097 0.092 0.081 0.116
( 45, Inf) 0.000 0.000 NA NA
-----------------------------------------------------------
The survival function AND the standard errors are identical, but LCL and UCL
are quite different, and much narrower for kaplanMeier, and this is true for
the other conf.type options (ident or plain and log-log) also.
The source of the difference is in the following piece of code in
kaplanMeier, which does not correspond to page 238-239 of the S+ Guide to
Statistics, Vol.2.
Here the code of kaplanMeier:
----------------------------------
if(conf.interval == "identity") {
temp1 <- surv + zval * se * surv
temp2 <- surv - zval * se * surv
fiti <- cbind(fiti, lower = pmax(temp2,
0), upper = pmin(
temp1, 1))
}
if(conf.interval == "log") {
xx <- ifelse(surv == 0, 1, surv)
#avoid some "log(0)" messages
temp1 <- ifelse(surv == 0, NA, exp(log(xx)
+ zval * se))
temp2 <- ifelse(surv == 0, NA, exp(log(xx)
- zval * se))
fiti <- cbind(fiti, lower = temp2, upper =
pmin(temp1, 1))
}
if(conf.interval == "log-log") {
who <- (surv == 0 | surv == 1)
#special cases
temp3 <- ifelse(surv == 0, NA, 1)
xx <- ifelse(who, 0.1, surv) #avoid some
"log(0)" messages
temp1 <- exp( - exp(log( - log(xx)) +
(zval * se)/log(xx)))
temp1 <- ifelse(who, temp3, temp1)
temp2 <- exp( - exp(log( - log(xx)) -
(zval * se)/log(xx)))
temp2 <- ifelse(who, temp3, temp2)
fiti <- cbind(fiti, lower = temp2, upper =
temp1)
}
-------------------------------
For example, the formula at the bottom of page 238, would imply the
following code
temp1 <- surv + zval * se
rather then the one given above.
Either the formulas of page 238-239 are incorrect or the above code.
Notice that for the survfit function the confidence interval is calculated
in survfit.km, but the code is only apparently identical with the above
code, because the variable std.err is different! In fact the final std.err
is calculated in summary.survfit by multiplying it by the survival function.
All this is not very transparent and easily leads to misunderstandings.
It is easy to modify kaplanMeier to give the same results as survfit, but
before I do that I would like to have your opinion.
Best regards,
Luciano Molinari
-----------------------
Dr. Luciano Molinari
Abt. EEG und AWE
Kinderspital Zürich
Steinwiesstr.75
CH-8032 Zürich
luciano.molinari@kispi.unizh.ch
Tel. (00411) 266 79 25 oder (00411) 266 75 66
Fax (00411) 266 71 71
-----------------------------------------------------------------------
This message was distributed by s-news@wubios.wustl.edu. To unsubscribe
send e-mail to s-news-request@wubios.wustl.edu with the BODY of the
message: unsubscribe s-news
|