On Wed, 27 Jan 1999, Brian Cade wrote:
> area) that I want to relate to 4 independent variables. Using glm I
> fit both a linear least square regression, glm(surv~ x1 + x2 + x3 +
> x4,family="gaussian"), and a linear model with logit link and constant
> variance, glm(surv~ x1 + x2 + x3 + x4,family ="quasi"(link = "logit",
> variance="constant"). Residual plots suggested both models achieved
> similar homogeneity of residuals and there seemed to be little
> justification to model the variance function differently than constant
> (e.g., mu(1-mu) for logit was no improvement. I thought I might be
> able to use AIC to pick between these models, but then wondered what
> "AIC" computation to use as several alternatives have been offered.
> Below is a summary of deviances and AIC for the 2 models
>
> identity link logit link
>
>
> residual deviance 17.245 16.536
> df 125 125
>
> dispersion 0.138 0.132
> parameter
>
> AIC -252.601 -258.059
>
> "quasi" AIC 18.625 17.856
>
> Now as these really are both least squares regressions, one a linear
> model (y=b0 + b1x1+ b2x2 + ..)and the other a nonlinear model (y =
> exp(b0 + b1x1 + b2x2 + ...)/(1+exp(b0 + b1x1 + b2x2 + ...)) where the
> loglikelihood is for a normal error distribution, I would think to use
> an AIC based on -2loglikelihood + 2p, where -2loglikelihood =
> n*log(RSS/n), RSS = residual sums of squares = residual deviance, and
> p is number of parameters. But often in the glm literature related to
> quasi likelihoods where a dispersion parameter is estimated I see a
> quantity quasi AIC = deviance/estimate of dispersion parameter + 2p or
> in Venables and Ripley (1994) quasi AIC = deviance + 2p(estimate of
> dispersion). Venables and Ripley suggest the estimate of dispersion
> parameter should be held constant to compare models, but that would
> make it a constant that effectively could be removed from the formula.
> All three of these variants of AIC order the models the same (as does
Whooah. There is only one AIC, that defined by Akaike. (Trevor Hastie may
disagree, but the rules of priority are quite clear: Akaike's AIC has
precedence over his.) Your `quasi AIC' is Mallows' C_p, I believe, as we
say. AIC is only defined if you have a likelihood, and a "quasi" model
does not (almost by definition). Although for some reason you are quoting
an outdated edition of our book, I do not recognize what you say we say
with the actual page in front of me. To be blunt: step.glm got this wrong
(and the method used is dangerous): stepAIC in library MASS is much less
likely to mislead you. [WARNING: in 5.0r2 and r3 step() is even more
broken, and may give completely incorrect results.]
> just looking at deviance because number of model parameters is the
> same) but the scale of relative differences is different for the
> different variants of AIC. It is the relative differences of AIC that
> are supposed to be important when comparing models. What is the best
Yes, relative differences of AIC, not of perversions of it.
> way to compare glm's with different link functions and estimated
> dispersions (and possibly different number of parameters) using AIC?
> Or should this not be done?
Use AIC, if it exists. Beware though that the theoretical backing for
AIC-based model choice for non-nested models is pretty weak, adn in your
example I find it hard to see that there will be any practical difference
between the models. Statistical and practical significance can be
very different issues.
> (2)
> An interesting issue occurred with estimating the logit link model in
> 3 different packages, S-Plus 4.5, SAS, SYSTAT 7; different parameter
> estimates or standard errors. Results below. Note that deviances
> (RSS = 16.5365) and estimates of dispersion (MSE = 0.1323) were the
> same for all packages. Anyone have any comment on what S-Plus is
> doing differently than SAS and SYSTAT?
>
> S-Plus SAS Insight SYSTAT 7
> glm( family=quasi(link glm (quasi,link=logit, Nonlinear
> =logit,variance=const) (variance=normal, model
> deviance) (loss = RSS,
> Gauss-Newton)
>
> b0(se) -0.098 (6.707) -0.383 (8.295) -0.383 (6.767)
>
> b1(se) 0.002 (0.003) 0.002 (0.003) 0.002 (0.003)
>
> b2(se) -0.003 (0.002) -0.003 (0.002) -0.003 (0.002)
>
> b3(se) 0.557 (0.208) 0.567 (0.262) 0.567 (0.211)
>
> b4(se) -0.033 (0.009) -0.034 (0.011) -0.034 (0.009)
>
> SYSTAT nonlinear least squares parameter estimates are similar to
> those from SAS glm but SYSTAT standard errors are more similar to
> S-plus glm standard errors. Great discrepancy between SAS/SYSTAT b0
> estimate and that of S-Plus is disturbing.
Have these converged? Try changing the control tolerances to be sure. Often
the likelihood surface is very flat if you have highly related or
ineffective carriers, and the convergence criterion in glm() is rather lax.
Actually, I don't see that a difference of less than 5% of an s.e. should
disturb you.
--
Brian D. Ripley, ripley@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272860 (secr)
Oxford OX1 3TG, UK Fax: +44 1865 272595
-----------------------------------------------------------------------
This message was distributed by s-news@wubios.wustl.edu. To unsubscribe
send e-mail to s-news-request@wubios.wustl.edu with the BODY of the
message: unsubscribe s-news
|