Dear S+ users,
Sorry to take so long in summarizing the responses to my question (I'm still
trying to get a handle on all of it). First, a public thank you for
everyone who helped on this:
Alan Zaslavsky
Douglas Bates
Renaud Lancelot
Joseph Lucke
Frank Lawrence
Sama LowChoy
Pat Sullivan
Anne York
Rather than posting all the e-mails that have been flying back and forth on
this, I'll try and summarize (which means if I make a mistake on something,
it's most likely my fault in trying to get a handle on all this). First
thing to note...never actually use "_" in a label. I was just doing that
here because I think they're easier to read than using ".". However, if you
use a "_" as a label in Splus, bad things will happen. My original question
is posted below.
If I am trying only to estimate the success rates in the units that are
contained in my dataset and am not concerned with serial correlation, I
could just use a glm() and treat Unit as a 20-level factor:
glm(Success~Unit + Road_Density + Season_Length, family = binomial(link =
logit),..... )
If I am trying to estimate an equation for the success rates for units in my
data set *and* outside my dataset, then I'd would use a mixed-effects model:
lme(fixed = Success ~ Road_Density + Season_Length, random = Season_Length |
Unit, data = my_data_file)
A number of people pointed out that Road_Density can not have a random
effect because there is no variability within the Units (I'm kicking myself
for having forgotten that). The above equation will produce a mean success
rate for all units, one parameter that incorporates the effect of
Road_Density, one parameter the incorporates the effect of Season_Length,
and 40 Best Linear Unbiased Predictors (BLUPs)(one intercept predictor for
each of my 20 units and one slope predictor relating to Season_Length for
each of my 20 units). If I want to predict success rates in a unit that was
not in my dataset, I use the mean and two parameters. If I want to predict
success rates for a unit that was in my data set, I use the mean, the two
parameters, and that unit's BLUPs.
However, mixed-effects models do not necessarily take into account all the
residual serial correlation. Therefore, I should also look at adding the
"corr = corAR1()" to my lme() statement to test for more complicated
correlation structure.
The way one responder put it... I'm caught between a fixed-effects GLM to
handle my non-Gaussain Success variable, and a Gaussian random effects model
with a transformed success rate. One solution to this is to use generalized
mixed-effects models in SAS (supposedly a GLIMMIX macro) or a program such
as MLwiN (www.ioe.ac.uk/mlwin). I was also given the reference:"Comparisons
of Software Packages for Generalized Linear Multilevel Models" by X Zhou, AJ
Perkins, and SL Hui in The American Statistician, August 1999, 53(3),
282-290. Another solution, if I still want to go with S+, is to use
Generalized Estimating Equations (GEEs). There are 2 S+ GEE libraries: gee
and yags (by the same author), available at
biosun1.harvard.edu/~carey/index.ssoft.html.
Thanks for everyone's help on this!!! If anyone else has any comments (or
corrections to what I've stated here), please feel free to contact me.
Thanks again!
-- Andy
***********************************************************
Andy Cooper
Fisheries Conservation Biologist
National Audubon Society, Living Oceans Program
550 South Bay Ave.
Islip, New York 11751
(631) 859-1588
FAX: (631) 581-5268
acooper@audubon.org
> -----Original Message-----
> From: COOPER, Andy
> Sent: Thursday, January 06, 2000 3:27 PM
> To: S-News
> Subject: [S] Mixed Effects model clarification
>
>
>
>
> Dear S+ users,
>
> Please forgive me if this seems like a very basic question,
> but this is my
> first look at mixed-effect models. I've searched all my
> limited resources
> (S+ manuals, Venables and Ripley (1999), and the StatLib
> archives), and
> can't quite find the answer to my question. I'm using S-plus
> 2000 Release 2
> on a PC running Windows95. Basically, I'm trying to figure
> out if S+ is
> doing what I think/hope it's doing, and that it's actually
> the correct thing
> to do.
>
> I'm trying to estimate an equation to predict the success rate of elk
> hunters in Idaho. The state is broken up into about 20
> units, and I have
> data for each of these units for 6 years. My dependent
> variable is the
> success rate of hunters in each of the units in each year.
> Some of the
> independent variables that may be related to the success rate
> change across
> the years and differ between units (e.g. length of the
> hunting season that
> year, what weapons were allowed that year, an estimate of the
> elk population
> size that year, etc). There are other independent variables
> that describe
> the physical features of each unit that do not change over
> time (at least at
> the scale which I'm able to measure them), such as road
> density, percent
> area that is forested, etc. As such, my data looks a bit like
> this (but with
> more years, more units, and more variables...these are
> made-up numbers):
>
> Unit Year Success Season_Length Elk_Population_Size
> Road_Density
> 1 1990 0.2 100 1000 0.05
> 1 1991 0.15 120 1020 0.05
> 1 1992 0.18 110 900 0.05
> 2 1990 0.1 60 200 0.02
> 2 1991 0.09 60 250 0.02
> 2 1992 0.15 80 220 0.02
>
> What I am trying to do is determine which variables should be
> included, and
> estimate the parameters for the equations :
>
> S(i,j) = b0 + b1(j) + b2*Season_Length(i,j) +
> b3*Pop_Size(i,j)... (eq 1)
> b1(j) = a0(j) + a1*Road_Density(j) +
> a2*Percent_Forested_Area(j) + .... (eq
> 2)
>
> Where S(i,j) is the properly transformed success rate in year
> i for unit j.
> Basically, the variable that allows each unit to have a
> separate intercept,
> b1(j), is a function of the attributes of the unit (which do
> not change over
> time).
>
> To test this, what I think I would use would be the lme()
> equation as such:
>
> lme(Success ~ Season_Length, random = ~1|Unit, data = my_data_file)
>
> This particular equation would allow me to test the significance of
> Season_Length, and produce estimates for the parameters b0,
> a0(j) and b2.
> This makes sense to me, and I think it's what I'm trying to
> do and supposed
> to do (But I could easily be wrong).
>
> Where I get confused is when I'm trying to test for the
> significance of
> something like Road_Density which is a continuous-valued,
> constant attribute
> of the Unit. From what I've been able to decipher so far, I
> should use the
> command:
>
> lme(Success ~ Road_Density, random = ~Road_Density|Unit, data =
> my_data_file)
>
> if I want to allow the parameter for Road_Density to vary by
> Unit, and the
> command:
>
> lme(Success ~ Road_Density, random = ~1|Unit, data = my_data_file)
>
> to test whether the parameter is the same for all Units. What
> I can't quite
> figure out is how S+ takes into account the fact that Road_Density is
> constant over time, or whether it explicitly needs to. For
> instance if this
> were a repeated measures anova (using the aov() statement with
> "+Error(Unit)" included), there are separate tests for within
> Units and
> between Units, each with the appropriate degrees of freedom. Does the
> mixed-effect model simply take both of those into account and
> produce an
> over-all significance test?
>
> Is this making any sense? Am I interpreting all this properly? Any
> thoughts on whether this is really what I should be doing?
>
> Thank you for your help!!!!
>
> Sincerely,
> Andy
>
> ***********************************************************
> Andy Cooper
> Fisheries Conservation Biologist
> National Audubon Society, Living Oceans Program
> 550 South Bay Ave.
> Islip, New York 11751
> (631) 859-1588
> FAX: (631) 581-5268
> acooper@audubon.org
>
> --------------------------------------------------------------
> ---------
> This message was distributed by s-news@wubios.wustl.edu. To
> unsubscribe
> send e-mail to s-news-request@wubios.wustl.edu with the BODY of the
> message: unsubscribe s-news
>
-----------------------------------------------------------------------
This message was distributed by s-news@wubios.wustl.edu. To unsubscribe
send e-mail to s-news-request@wubios.wustl.edu with the BODY of the
message: unsubscribe s-news
|