Greetings to you all,
I am looking for advice on how to select the ‘best’ of multiple models.
I have trawled the archives, but have not had any joy.
I am using S-plus 6.2 on Windows XP and looking at species presence/absence data.
I have used GLMs and GAMs to investigate which environmental variables influence distribution.
I am also comparing whether a simple or a more detailed classification of habitat
(one of the predictor variables) gives the better model fit.
For each species, I have 4 models:
- 2 GLMs (1 using the simple habitat classification and 1 using the detailed classification), and
- 2 GAMs (simple habitat and detailed habitat).
Each final model is selected by backward stepwise selection, using AIC as the criterion for dropping variables.
While I understand that D squared (1 - residual deviance/null deviance)
should not be used to compare models based on a binomial distribution
(because it does not follow the Chi squared distribution),
I have taken the view that these 4 models are nested, and so I have calculated
D squared and adjusted D squared values for them and have been using these
to compare the 4 models. Please correct me if this is not the case.
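
A minimal sketch of the calculation described above, written in Python rather than S-plus and using made-up deviance values for illustration only. The adjusted form is an assumption, since it is not spelled out above: it uses the common adjustment 1 - ((n - 1)/(n - p)) * (1 - D squared), with n observations and p fitted parameters. The function names and the numbers are hypothetical.

```python
def d_squared(residual_deviance, null_deviance):
    """D squared: proportion of the null deviance explained by the model."""
    if null_deviance <= 0:
        raise ValueError("null deviance must be positive")
    return 1.0 - residual_deviance / null_deviance


def adjusted_d_squared(d2, n_obs, n_params):
    """Adjusted D squared, penalising model size.

    Assumed form: 1 - ((n - 1) / (n - p)) * (1 - D squared).
    """
    if n_obs <= n_params:
        raise ValueError("need more observations than parameters")
    return 1.0 - ((n_obs - 1) / (n_obs - n_params)) * (1.0 - d2)


# Hypothetical inputs: one species, 200 sites, shared null deviance.
N_OBS = 200
NULL_DEVIANCE = 280.0

# model label -> (residual deviance, number of fitted parameters), all made up
models = {
    "GLM, simple habitat": (210.0, 4),
    "GLM, detailed habitat": (196.0, 9),
    "GAM, simple habitat": (203.0, 7),
    "GAM, detailed habitat": (189.0, 12),
}

for label, (residual_deviance, n_params) in models.items():
    d2 = d_squared(residual_deviance, NULL_DEVIANCE)
    adj = adjusted_d_squared(d2, N_OBS, n_params)
    print(f"{label:24s} D2 = {d2:.3f}  adjusted D2 = {adj:.3f}")
```

For a GAM, p would presumably have to be the effective degrees of freedom of the fit rather than a simple count of coefficients; that is also an assumption here.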
At present I am just eyeballing these values to see which model explains more deviance,
but I was hoping that someone could suggest a more rigorous way of determining which model is best.
Any thoughts or suggested reading would be appreciated.
Look forward to comments.
Brenton
Brenton Chatfield
PhD Candidate
UWA & Coastal CRC
School of Earth and Geographical Sciences
University of Western Australia (M004)
35 Stirling Highway
Crawley WA 6009
Email: chatfb01@student.uwa.edu.au
Telephone: +61 8 6488 4235
Fax: +61 8 6488 1037