s-news
[Top] [All Lists]

Re: Simulation of survival data to produce sample size estimate.

To: John Sorkin <jsorkin@grecc.umaryland.edu>
Subject: Re: Simulation of survival data to produce sample size estimate.
From: Spencer Graves <spencer.graves@PDF.COM>
Date: Mon, 16 Jun 2003 17:59:52 -0700
Cc: s-news@lists.biostat.wustl.edu
References: <seee0ed8.003@medicine.umaryland.edu>
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.0.2) Gecko/20030208 Netscape/7.02
The Survival function S(t) = Pr(T>t) = exp(-H(t)) where H(t) = cumulative hazard = integral of the hazard rate h(t.). If we can get h(t), then we can get H(t), and then we can generate n pseudo-random numbers by inverting S(runif(n)). Since S(t) = exp(-H(t)), we get H(t) = (-log(runif(n)). Now all we need is H(t) and its inverse function.

To find H(t), we work from the facts you gave: Namely that h(50) = 0.01, and h(t+8) = 2*h(t). Thus, for t = 50+8*k, we get h(50+8*k) = 0.01*2^k. Substituting now k = (t-50)/8, we get therefore

          h(t) = 0.01*exp(log(2)*(t-50)/8) = h0*exp(a*t)

where a = log(2)/8 and h0 = 0.01*exp(-50*a).

From this, we get H(t) by integrating, namely H(t) = H0 + (h0/a)*exp(a*t), where H0 = (-h0/a)*exp(a*t0), t0 = the youngest age you want in your pseudo-random numbers. This choice of H0 makes H(t) = 0 at t0, which makes S(t0) = 1. If you won't accept anyone younger than 50 in your study, then t0=50.

Now, to get the pseudo-random numbers you want, we further apply the inverse of H(t) to (-log(runif(n)). I wrapped this all into functions for this distribution following the standard convention with initial letters = dpqr = density, probability, quantile, and random numbers plus h for hazard; see below. I wrote the whole suite, because doing so helps me think through this kind of thing. I didn't debug them all, but the radult.human seemed to produce numbers that looked plausible.

hope this helps.  spencer graves

# Human hazard = 0.01 at age 50,
# and doubles every 8 years

hadult.human <-
function(x, t0=0, cumulative=TRUE, log.=FALSE){
#       (cumulative) hazard function
# for the "adult human" distribution,
# which is 0.01 at age 50
# and doubles every 8 years thereafter
        a <- log(2)/8
        h0 <- 0.01*exp(-50*a)
        hazard <- h0*exp(a*x)
        if(cumulative)
                hazard <- ((hazard - h0*exp(a*t0))/a)
        if(log.) log(hazard) else hazard
}

padult.hazard <-
function(q, t0=0, lower.tail=TRUE){
# probability function (cum dist'n f'n)
# for the "adult human" distribution,
# which is 0.01 at age 50
# and doubles every 8 years thereafter
        S.t <- exp(-hadult.human(q, t0=t0))
        if(lower.tail) (1-S.t)
        else S.t
}

qadult.hazard <-
function(p, t0=0){
# quantile function = inverse cum dist'n f'n
# for the "adult human" distribution,
# which is 0.01 at age 50
# and doubles every 8 years thereafter
        Haz <- (-log(1-p))
        a <- log(2)/8
        h0 <- 0.01*exp(-50*a)
#  Cum hazard = (h0/a)*(exp(a*t)-exp(a*t0))
#  so 1 = S(t0) = exp(-Cum.hazard)
        e.at <- (a/h0)*Haz+exp(a*t0) 
        log(e.at)/a
}

radult.hazard <-
function(n, t0=0){
        qadult.hazard(runif(n), t0=t0)  
}

dadult.hazard <-
function(x, t0){
        hadult.hazard(x=x, t0=t0)*padult.hazard 
}



        

John Sorkin wrote:
I am planning a longitudinal follow-up of a human population. In the
past when I needed a sample size estimate for a survival study I have
used a parametric approach to the sample size estimation, i.e. I used an
exponential survival function to estimate the sample size. I recognize
that the exponential function is not a good estimate for human survival
in that the exponential distribution is memory less. In order to obtain
a more realistic sample size estimate, I would like to simulate the
survival distribution. It is know that mortality doubles approximately
every eight years in adult human populations and that mortality is
approximately 1% at age 50. Unfortunately I have no Idea how to set up
the simulation. I hope some kind soul will be able, and willing to help.
Thanks

John Sorkin
John Sorkin MD, PhD
Chief, Biostatistics and Informatics
Baltimore VA Medical Center GRECC and
University of Maryland School of Medicine Claude Pepper OAIC
410-605-7119 john@grecc.umaryland.edu
--------------------------------------------------------------------
This message was distributed by s-news@lists.biostat.wustl.edu.  To
unsubscribe send e-mail to s-news-request@lists.biostat.wustl.edu with
the BODY of the message:  unsubscribe s-news



<Prev in Thread] Current Thread [Next in Thread>