AIC model selection and model averaging

To: <s-news@lists.biostat.wustl.edu>
Subject: AIC model selection and model averaging
From: "Huso, Manuela" <manuela.huso@oregonstate.edu>
Date: Mon, 21 Jul 2003 11:28:37 -0700
Thread-topic: [S] model averaging and all- subsets glm's

Hello, all,

I am a statistician whose job it is to consult with researchers in natural 
resources, primarily forestry and wildlife, about study design and analysis.  
Burnham and Anderson's book entitled 'Model Selection and Inference: a 
Practical Information-Theoretic Approach' has caused quite a stir, particularly 
in the wildlife community, and I have people wanting to apply the technique in 
every possible situation. 

I understand that Dr. Ripley has urged extreme caution in following B&A's 
guidelines.  I am writing to ask for some specific points of criticism and/or 
suggestions of literature that I might read to be able to form an educated 
opinion of where their techniques can/should be applied, where they shouldn't 
and how to know the difference.

I am also particularly interested in model averaging concepts and their 
advantages and limitations in both the AIC and BIC context.

Many thanks for your help and I hope I don't start a flood :-)

Manuela
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Manuela Huso
Consulting Statistician
201H Richardson Hall
Department of Forest Science
Oregon State University
Corvallis, OR   97331-5752
ph: 541-737-6232
fx: 541-737-1393


-----Original Message-----
From: Spencer Graves [mailto:spencer.graves@PDF.COM]
Sent: Monday, July 14, 2003 7:48 AM
To: Mary Wisz
Cc: s-news@lists.biostat.wustl.edu
Subject: Re: [S] model averaging and all- subsets glm's


...
          2.  On 6/25/2003, Brian Ripley expressed concern about Burnham and 
Anderson's book in a thread "logLik.lm()";  see below.  I'm currently a 
third of the way through reading Pattern Recognition and Neural 
Networks, recommended by Ripley below.  Using a full Bayesian approach 
(integrating out parameters, etc.) should be easy for "lm".  With 
something like "glm", this would be much harder, requiring, e.g., 
Hermite polynomial integration with saddle point approximations or 
Markov Chain Monte Carlo.

hope this helps.  spencer graves

 > Dear Prof. Ripley:
 >
 >       I gather you disagree with the observation in Burnham and Anderson
 > (2002, ch. 2) that the "complexity penalty" in the Akaike Information
 > Criterion is a bias correction, and with this correction, they can use
 > "density = exp(-AIC/2)" to compute approximate posterior probabilities
 > comparing even different distributions?

That's the derivation of BIC and similar, not AIC.

 >       They use this even to compare discrete and continuous distributions,
 > which makes no sense to me.  However, with a common dominating measure,
 > it seems sensible to me.  They cite a growing literature on "Bayesian
 > model averaging".  What I've seen of this claims that Bayesian model
 > averaging produces better predictions than predictions based on any
 > single model, even using these approximate posteriors ("Akaike weights")
 > in place of full Bayesian posteriors.
 >
 >       I don't have much experience with this, but so far, I seem to have
 > gotten great, informative answers to my clients' questions.  If there
 > are serious deficiencies with this kind of procedure, I'd like to know.

Yes, model averaging is useful, but is nothing to do with AIC nor Burnham
& Anderson.  See e.g. my PRNN book for better ways to do it.

Burnham & Anderson (2002) is a book I would recommend people NOT to read
until they have read the primary literature.  I see no evidence that the
authors have actually read Akaike's papers.
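
For readers unfamiliar with the "Akaike weights" discussed in the quoted
question, the following is a minimal Python sketch, not code from Burnham and
Anderson or from any poster in this thread. It assumes the usual definition,
in which each model's weight is exp(-delta/2) normalised over the candidate
set, where delta is that model's AIC minus the smallest AIC in the set. The
function names and the AIC values are hypothetical. Whether such weights may
be read as approximate posterior probabilities is exactly the point disputed
above.

```python
import math


def akaike_weights(aic_values):
    """Return exp(-delta/2) normalised to sum to 1, delta = AIC - min(AIC)."""
    best = min(aic_values)
    # Subtracting the minimum first keeps exp() from underflowing.
    relative = [math.exp(-(aic - best) / 2.0) for aic in aic_values]
    total = sum(relative)
    return [r / total for r in relative]


def model_average(estimates, weights):
    """Weighted average of one quantity estimated under each model."""
    return sum(w * e for w, e in zip(weights, estimates))


# Hypothetical AIC values for three candidate models (illustration only).
aics = [100.0, 102.0, 110.0]
weights = akaike_weights(aics)
```

A model whose AIC is 2 units above the best one receives a weight smaller by
a factor of exp(1), so a handful of models usually carries almost all of the
weight.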


-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
##################################################################
Mary Wisz wrote:
> Does anyone have s-plus or r code to perform model averaging for GLM's
> (such as that advocated by Burnham and Anderson 1998, 2002)?
> Alternatively, does anyone have code to do all-subsets modelling in glm
> that reports parameter estimates, AICc, or at least Log-likelihood and
> number of parameters for  each model? By all-subsets modelling I mean an
> automated procedure that  builds glm models from all possible combinations
> of the predictors.
> 
> I am aware of concerns about using all-subsets modelling.
> 
> 
> Thank you!
> 
> All the best,
> Mary Wisz
> 
> -------------------------------------------------------------------------------
> Mary S. Wisz
> Conservation Biology Group **
> Department of Zoology
> Downing Street
> University of Cambridge
> Cambridge CB2 3EJ, England
> 
> msw31@cam.ac.uk
> 
> **This summer I can be reached at:
> Zoological Museum
> Vertebrate Department
> Universitetsparken 15
> DK-2100 Copenhagen 0
> Denmark
> 
>  Tel. Museum + 45 3532 1031
>       Hm  + 45 3916 1417
> 
> -------------------------------------------------------------------------------
> 
> 
> --------------------------------------------------------------------
> This message was distributed by s-news@lists.biostat.wustl.edu.  To
> unsubscribe send e-mail to s-news-request@lists.biostat.wustl.edu with
> the BODY of the message:  unsubscribe s-news
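
The all-subsets request quoted above (parameter estimates, log-likelihood,
number of parameters and AICc for every combination of predictors) can be
sketched as follows. This is a hedged illustration in Python, not S-PLUS or R
code from the thread. It assumes the simplest GLM only, a Gaussian family with
identity link fitted by least squares; other families would need an iterative
fit. The parameter count k includes the error variance, AICc is taken as
AIC + 2k(k+1)/(n-k-1), and all names and data are made up.

```python
import itertools
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_gaussian(columns, y):
    """Least-squares fit with intercept; returns (coefficients, loglik, k)."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(design[0])
    xtx = [[sum(r[a] * r[b] for r in design) for b in range(p)] for a in range(p)]
    xty = [sum(r[a] * yi for r, yi in zip(design, y)) for a in range(p)]
    beta = solve(xtx, xty)
    rss = sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2
              for r, yi in zip(design, y))
    # Gaussian log-likelihood evaluated at the ML variance estimate rss / n.
    loglik = -0.5 * n * (math.log(2.0 * math.pi) + math.log(rss / n) + 1.0)
    return beta, loglik, p + 1  # +1 counts the error variance


def all_subsets(predictors, y):
    """Fit every subset of predictors; return rows sorted by AICc."""
    n = len(y)
    names = sorted(predictors)
    rows = []
    for size in range(len(names) + 1):
        for subset in itertools.combinations(names, size):
            beta, loglik, k = fit_gaussian([predictors[v] for v in subset], y)
            aic = -2.0 * loglik + 2.0 * k
            aicc = aic + 2.0 * k * (k + 1) / (n - k - 1)
            rows.append({"terms": subset, "coef": beta, "loglik": loglik,
                         "k": k, "aic": aic, "aicc": aicc})
    return sorted(rows, key=lambda row: row["aicc"])


# Made-up data: y depends mainly on x1.
predictors = {"x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
              "x2": [2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0]}
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]
results = all_subsets(predictors, y)
```

The number of fits doubles with each added predictor, which is one practical
reason for the concerns about all-subsets modelling acknowledged in the
quoted message.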


