The Survival function S(t) = Pr(T>t) = exp(-H(t)) where H(t) =
cumulative hazard = integral of the hazard rate h(t.). If we can get
h(t), then we can get H(t), and then we can generate n pseudo-random
numbers by inverting S(runif(n)). Since S(t) = exp(-H(t)), we get H(t)
= (-log(runif(n)). Now all we need is H(t) and its inverse function.
To find H(t), we work from the facts you gave: Namely that h(50) =
0.01, and h(t+8) = 2*h(t). Thus, for t = 50+8*k, we get h(50+8*k) =
0.01*2^k. Substituting now k = (t-50)/8, we get therefore
h(t) = 0.01*exp(log(2)*(t-50)/8) = h0*exp(a*t)
where a = log(2)/8 and h0 = 0.01*exp(-50*a).
From this, we get H(t) by integrating, namely H(t) = H0 +
(h0/a)*exp(a*t), where H0 = (-h0/a)*exp(a*t0), t0 = the youngest age you
want in your pseudo-random numbers. This choice of H0 makes H(t) = 0 at
t0, which makes S(t0) = 1. If you won't accept anyone younger than 50
in your study, then t0=50.
Now, to get the pseudo-random numbers you want, we further apply the
inverse of H(t) to (-log(runif(n)). I wrapped this all into functions
for this distribution following the standard convention with initial
letters = dpqr = density, probability, quantile, and random numbers plus
h for hazard; see below. I wrote the whole suite, because doing so
helps me think through this kind of thing. I didn't debug them all, but
the radult.human seemed to produce numbers that looked plausible.
hope this helps. spencer graves
# Human hazard = 0.01 at age 50,
# and doubles every 8 years
hadult.human <-
function(x, t0=0, cumulative=TRUE, log.=FALSE){
# (cumulative) hazard function
# for the "adult human" distribution,
# which is 0.01 at age 50
# and doubles every 8 years thereafter
a <- log(2)/8
h0 <- 0.01*exp(-50*a)
hazard <- h0*exp(a*x)
if(cumulative)
hazard <- ((hazard - h0*exp(a*t0))/a)
if(log.) log(hazard) else hazard
}
padult.hazard <-
function(q, t0=0, lower.tail=TRUE){
# probability function (cum dist'n f'n)
# for the "adult human" distribution,
# which is 0.01 at age 50
# and doubles every 8 years thereafter
S.t <- exp(-hadult.human(q, t0=t0))
if(lower.tail) (1-S.t)
else S.t
}
qadult.hazard <-
function(p, t0=0){
# quantile function = inverse cum dist'n f'n
# for the "adult human" distribution,
# which is 0.01 at age 50
# and doubles every 8 years thereafter
Haz <- (-log(1-p))
a <- log(2)/8
h0 <- 0.01*exp(-50*a)
# Cum hazard = (h0/a)*(exp(a*t)-exp(a*t0))
# so 1 = S(t0) = exp(-Cum.hazard)
e.at <- (a/h0)*Haz+exp(a*t0)
log(e.at)/a
}
radult.hazard <-
function(n, t0=0){
qadult.hazard(runif(n), t0=t0)
}
dadult.hazard <-
function(x, t0){
hadult.hazard(x=x, t0=t0)*padult.hazard
}
John Sorkin wrote:
I am planning a longitudinal follow-up of a human population. In the
past when I needed a sample size estimate for a survival study I have
used a parametric approach to the sample size estimation, i.e. I used an
exponential survival function to estimate the sample size. I recognize
that the exponential function is not a good estimate for human survival
in that the exponential distribution is memory less. In order to obtain
a more realistic sample size estimate, I would like to simulate the
survival distribution. It is know that mortality doubles approximately
every eight years in adult human populations and that mortality is
approximately 1% at age 50. Unfortunately I have no Idea how to set up
the simulation. I hope some kind soul will be able, and willing to help.
Thanks
John Sorkin
John Sorkin MD, PhD
Chief, Biostatistics and Informatics
Baltimore VA Medical Center GRECC and
University of Maryland School of Medicine Claude Pepper OAIC
410-605-7119
john@grecc.umaryland.edu
--------------------------------------------------------------------
This message was distributed by s-news@lists.biostat.wustl.edu. To
unsubscribe send e-mail to s-news-request@lists.biostat.wustl.edu with
the BODY of the message: unsubscribe s-news
|