s-news

Help to speed up a for loop (leave-one-out procedure),

To: <s-news@lists.biostat.wustl.edu>
Subject: Help to speed up a for loop (leave-one-out procedure),
From: "Javier Seoane" <seoane@ebd.csic.es>
Date: Mon, 20 May 2002 11:57:52 +0200
Dear S-Plus users,

I am trying to implement a "leave-one-out" procedure for the predictions of a 
model. I have managed to do it, but in a very inefficient way (details 
below). To be specific, I would like to (1) fit a model to every 
observation except one (the model was specified previously), (2) make the 
corresponding prediction for the one observation left out, and (3) repeat until 
I have a vector of predictions for every observation. There is a further 
complication: I have a large number of previously specified models (about 70), 
so I want to repeat the steps above for each of them. I am running 
Windows 2000 with 256 MB of RAM. My S-PLUS version and my clumsy code follow:

> version

S-PLUS 2000 Professional Edition Release 2 for Microsoft Windows : 1999 

First I set up a vector with the names of the previously specified models: 

lista <- c("aegcau.escala150.b2", "siteur.escala150.b2", "trotro.escala150.b2")

Then I run a for loop inside a while loop:

comienzo <- proc.time()

# tmpESCALA150 has the response (n=1144) and predictors
predicciones2 <- 1:nrow(tmpESCALA150)

# set up an empty vector for predictions
predicciones <- numeric(nrow(tmpESCALA150))

n <- 1

while (n < length(lista)+1) {            # to visit every element of "lista"

  for (i in 1:nrow(tmpESCALA150)) {      # to visit every row of tmpESCALA150

    predicciones[i] <-
      predict.gam(
        # model$formula is previously specified
        gam(eval(parse(text=(paste(as.character(lista[n]), "$formula")))),
            data=tmpESCALA150[-i,], family=binomial),
        # predict response for the observation i left out
        tmpESCALA150[i,], type="response")

  }

  # clumsy, but I finally need a data set with the predictions ("predicciones2")
  predicciones2 <- cbind(predicciones2, predicciones)

  n <- n+1

}

predicciones2

tiempo <- proc.time()-comienzo

tiempo/60

The loop takes 254 min to complete with only three specified models (in 
"lista"). A loop for a single model takes 45 min. I am afraid that looping 
through all 70 models would be prohibitively slow. 
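
For reference, the three steps described above (fit without observation i, 
predict observation i, repeat for every model) can be sketched more compactly. 
This is only a sketch under assumptions: glm() stands in for gam() so that it 
runs without extra libraries, and the names loo.predict, datos, m1, m2, 
lista.demo and pred.mat, as well as the simulated data, are hypothetical. It 
looks each model up with get() instead of eval(parse()) and fills a 
preallocated matrix instead of growing one with cbind(). It still refits the 
model once per observation, so on its own it is not expected to change the 
running time much.

```r
# Sketch only: glm() stands in for gam(); all names and data are hypothetical.

# Leave-one-out predictions for one formula: refit without row i, predict row i.
loo.predict <- function(form, datos, fit = glm, ...) {
  n <- nrow(datos)
  pred <- numeric(n)
  for (i in 1:n) {
    ajuste <- fit(form, data = datos[-i, ], family = binomial, ...)
    pred[i] <- predict(ajuste, newdata = datos[i, , drop = FALSE],
                       type = "response")
  }
  pred
}

# Toy data and two "previously specified" models, referred to by name.
set.seed(1)
datos <- data.frame(x1 = rnorm(40), x2 = rnorm(40))
datos$y <- rbinom(40, 1, 1 / (1 + exp(-0.8 * datos$x1)))
m1 <- glm(y ~ x1, data = datos, family = binomial)
m2 <- glm(y ~ x1 + x2, data = datos, family = binomial)
lista.demo <- c("m1", "m2")

# One column per model, filled in place rather than grown with cbind().
pred.mat <- matrix(NA, nrow = nrow(datos), ncol = length(lista.demo),
                   dimnames = list(NULL, lista.demo))
for (nombre in lista.demo) {
  form <- formula(get(nombre))        # formula of the stored model object
  pred.mat[, nombre] <- loo.predict(form, datos)
}
```

With the real data, lista would replace lista.demo, tmpESCALA150 would replace 
datos, and gam would be passed as the fit argument.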

Any suggestions to speed up the loop would be appreciated,

Javier Seoane
Department of Applied Biology
Estación Biológica de Doñana, CSIC
Avda. María Luisa s/n
41013, Sevilla
SPAIN






