Dear S-Plus users,
I am trying to implement a "leave-one-out" procedure for the predictions of a
model. I managed to do it, but in a very inefficient way (details below). To
be specific, I would like to (1) fit a model to every observation except one
(the model was specified previously), (2) predict the response for the one
observation left out, and (3) repeat until I have a vector of predictions for
every observation. There is a further complication: I have a large number of
previously specified models (about 70), so I want to repeat the steps above
for each of them. I run Windows 2000 with 256 MB of RAM; my S-PLUS version
and my clumsy code follow:
> version
S-PLUS 2000 Professional Edition Release 2 for Microsoft Windows : 1999
First I set up a vector with the names of the previously specified models:
lista <- c("aegcau.escala150.b2", "siteur.escala150.b2", "trotro.escala150.b2")
Then a for loop nested inside a while loop:
comienzo <- proc.time()
# tmpESCALA150 has the response (n=1144) and predictors
predicciones2 <- 1:nrow(tmpESCALA150)
# set up an empty vector for predictions
predicciones <- numeric(nrow(tmpESCALA150))
n <- 1
while (n < length(lista)+1) {          # to visit every element of "lista"
  for (i in 1:nrow(tmpESCALA150)) {    # to visit every row of the original database (tmpESCALA150)
    # model$formula is previously specified
    predicciones[i] <-
      predict.gam(
        gam(eval(parse(text=(paste(as.character(lista[n]), "$formula")))),
            data=tmpESCALA150[-i,], family=binomial),
        tmpESCALA150[i,], type="response")   # predict response for the i-th observation left out
  }
  # clumsy, but I finally need a data set with the predictions ("predicciones2")
  predicciones2 <- cbind(predicciones2, predicciones)
  n <- n+1
}
predicciones2
tiempo <- proc.time()-comienzo
tiempo/60
The loop takes 254 min to complete with only three specified models (in
"lista"). A loop for a single model takes 45 min. Even at that lower rate,
looping through 70 models would take more than 50 hours (70 x 45 min), which
is impractically slow.
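The three steps described above can be isolated in a small function, which at least makes the procedure easier to time and to reuse across models. The following is a minimal sketch, not something tested on S-PLUS 2000. The names loo.predict, toy, toy.m1, toy.m2 and nombres are hypothetical and introduced only for illustration, and glm stands in for gam so that the example runs without extra libraries. It looks each model up by name with get() instead of eval(parse(...)). It still refits the model once per observation, so by itself it is not expected to be faster.

```r
# Leave-one-out predictions for one model formula.
# fit.fun is any fitting function with a (formula, data, family) interface;
# glm is an assumption made here so that the sketch is self-contained.
loo.predict <- function(formula, data, fit.fun = glm) {
  n <- nrow(data)
  pred <- numeric(n)                      # one prediction per observation
  for (i in 1:n) {
    # (1) fit the model without observation i
    fit <- fit.fun(formula, data = data[-i, ], family = binomial)
    # (2) predict the response for the observation left out
    pred[i] <- predict(fit, newdata = data[i, , drop = FALSE],
                       type = "response")
  }
  pred                                    # (3) vector of n predictions
}

# Hypothetical toy data standing in for tmpESCALA150
set.seed(1)
toy <- data.frame(x1 = rnorm(40), x2 = rnorm(40))
toy$y <- rbinom(40, 1, 1 / (1 + exp(-toy$x1)))

# Hypothetical fitted models standing in for the objects named in "lista"
toy.m1 <- glm(y ~ x1, data = toy, family = binomial)
toy.m2 <- glm(y ~ x1 + x2, data = toy, family = binomial)
nombres <- c("toy.m1", "toy.m2")

# One column of leave-one-out predictions per model name
loo <- sapply(nombres, function(nm) loo.predict(formula(get(nm)), toy))
```

Collecting the results with sapply gives one column per model, so the repeated cbind on a growing object is no longer needed.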
Any suggestions to speed up the loop would be appreciated,
Javier Seoane
Department of Applied Biology
Estación Biológica de Doñana, CSIC
Avda. María Luisa s/n
41013, Sevilla
SPAIN