s-news
[Top] [All Lists]

Cross validation of glmmPQL

To: <s-news@lists.biostat.wustl.edu>
Subject: Cross validation of glmmPQL
From: Ejrnæs, Rasmus <rej@dmu.dk>
Date: Fri, 2 Feb 2007 15:33:45 +0100
Cc: "Hans Henrik Bruun" <Hans_Henrik.Bruun@ekol.lu.se>
Thread-index: AcdG1yOs5rB6ouoIQPek+37ER0RGVQ==
Thread-topic: Cross validation of glmmPQL

Dear S-plussers

I am working in S-Plus 2000 on Windows XP, and also have R 2.4.1 installed on the same machine.

Me and my colleagues have a data set of germinated seeds of 14 species in 20 plots along a gradient in species composition (and productivity) in subarctic vegetation. 30 seeds of each species were assigned to disturbed and undisturbed vegetation in each plot.

Data are poisson-distributed (and zero-inflated).

My approach was to analyse number of germinated seeds in a glmmPQL with family=poisson. Disturbance, species identity and vegetation score along the primary ordination axes are fixed effects and plot is random effect.

I have seen in previous discussions in S-Plus and R groups that the anova tool for model selection is inappropriate with a non-normal family, so I gathered that I could try using a spearman rank correlation between cross-validated predictions from the model and observed germination as an alternative model selection criterion.

I have previously used the following cross-validation function (kindly provided years ago by Professor Ripley) for different kind of models, but I cannot figure out how to adjust the function to take into account that mixed-effects model take two formulas, one for fixed and one for random effects. Here is the function I have used for 20-fold cross validation on a glm model of this data set:

CVtest <-
function(form, rand)
{
        res <- recruit$recruitm
        for(i in sort(unique(rand))) {
                cat("fold ", i, "\n", sep = "")
                assign("i", i, f=1)
                learn <- glm(form, data = "" subset = rand != i, family=poisson)
                res[rand == i] <- predict(learn, newdata = recruit[rand == i,  ], type = "response")
        }
        res
}

rand <- sample(1:20,560, replace=T)
res1 <- CVtest(recruitm~species+disturb+nms1, rand)
cor.test(res1, recruit$recruitm, method = ”spearman”)


Any help for updating the CVtest function to glmmPQL or alternative suggestions for model selection strategy with glmmPQL models would be greatly appreciated.

Regards,

Rasmus Ejrnaes
Dept. of wildlife ecology and biodiversity
Environmental Research Institute
University of Aarhus

<Prev in Thread] Current Thread [Next in Thread>
  • Cross validation of glmmPQL, Ejrnæs, Rasmus <=