Re: Bootstrapping Newbie question

To: s-news@lists.biostat.wustl.edu
Subject: Re: Bootstrapping Newbie question
From: "Donald Catanzaro, PhD" <dgcatanzaro@gmail.com>
Date: Tue, 19 Aug 2008 15:04:56 -0500
Reply-to: dgcatanzaro@gmail.com
Good Day All,

I am a newbie to SPLUS and have a question that I can't seem to answer by reading the documentation or searching the web.

My goal is to fit a generalized linear model using the negative binomial distribution. I have fitted that model to the full dataset (~500 sample points), and now that it is done, I wish to evaluate the robustness of the model. Conceptually, I would like to see how the coefficient estimates vary under the standard 80:20 rule: fit the model with 80% of the data, test it on the 20% I held back, and then repeat the procedure 1000 times or so.
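To make the intended procedure concrete, here is a minimal sketch of a repeated random 80/20 split. It is written in R syntax, which is close to but not identical to SPLUS, so treat it as an illustration rather than tested SPLUS code. The data frame `sim` and its columns `y`, `x` and `AP` are simulated stand-ins, not the real LD data, and the hold-out RMSE is just one assumed way of scoring the reserved 20%. Note that each split samples rows without replacement, whereas a bootstrap resamples with replacement.

```r
library(MASS)  # provides glm.nb() and rnegbin()

set.seed(42)

# Simulated stand-in for LD (all values are made up for illustration):
# a count response y, one predictor x, and an exposure AP used as an offset.
n <- 500
sim <- data.frame(x = rnorm(n), AP = runif(n, 1, 10))
sim$y <- rnegbin(n, mu = sim$AP * exp(0.5 + 0.3 * sim$x), theta = 2)

B <- 50                     # number of random splits (kept small for speed)
n.train <- round(0.8 * n)   # 80% of the rows are used for fitting

coefs <- matrix(NA, nrow = B, ncol = 2,
                dimnames = list(NULL, c("(Intercept)", "x")))
rmse <- numeric(B)

for (b in seq_len(B)) {
  train.idx <- sample(n, n.train)   # rows drawn WITHOUT replacement
  train <- sim[train.idx, ]
  test  <- sim[-train.idx, ]        # the reserved 20%

  fit <- glm.nb(y ~ x + offset(log(AP)), data = train)
  coefs[b, ] <- coef(fit)

  # Predicted counts for the held-out rows, compared with the observed counts
  pred <- predict(fit, newdata = test, type = "response")
  rmse[b] <- sqrt(mean((test$y - pred)^2))
}

# Spread of the coefficient estimates across splits, and hold-out error
print(apply(coefs, 2, sd))
print(summary(rmse))
```

With the real data, `sim` would be replaced by LD and the formula by the full model formula; B could then be raised toward 1000 once a single pass runs cleanly.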

Conceptually this is in essence a resampling technique and I thought I would be using the bootstrapping and/or jackknife methods in SPLUS.

So two calls such as:

l.nb <- glm.nb(formula = TC ~ TE + C + SQC + CINC + SQC.IN + offset(log(AP)), data = LD, na.action = na.omit, control = glm.control(maxit = 500))

and then

l.boot1 <- bootstrap(data = LD, statistic = coef(eval(l.nb$call)), B = 200)

would in turn fit my model and then resample my data 200 times (I set B to 200 so it would not take too long). What I can't seem to figure out is how to get the 80/20 split, or how to retain the 20% for model validation. I have a limited understanding of the SPLUS bootstrap/jackknife functions, and the documentation leaves me puzzled, so I can't nail down the syntax.

I know that this has been done (a lot). Am I on the right track here? Could someone point me to some resources in the SPLUS documentation and/or web pages that I could read up on before I ask more questions?
Thanks in advance!

--

-Don
Don Catanzaro, PhD                  Landscape Ecologist
dgcatanzaro@gmail.com               16144 Sigmond Lane
479-751-3616                        Lowell, AR 72745

