
Re: SUMMARY (II): ...speed up a for loop (leave-one-out procedure)

To: Javier Seoane <seoane@ebd.csic.es>
Subject: Re: SUMMARY (II): ...speed up a for loop (leave-one-out procedure)
From: Frank E Harrell Jr <fharrell@virginia.edu>
Date: Mon, 27 May 2002 08:58:59 -0400
Cc: s-news@lists.biostat.wustl.edu
In-reply-to: <002301c20551$92cdb5c0$91e46fa1@leda>
Organization: University of Virginia
References: <002301c20551$92cdb5c0$91e46fa1@leda>
You should be able to use lm.influence to get leave-one-out coefficients 
extremely quickly, without refitting the model for each observation.  In 
loops of the kind you used, it is also better to call the lowest-level 
fitting functions, after using the high-level function once to get the 
design matrix and response vector.  See lm.fit, or the lm.fit.qr.bare 
function in the Hmisc library, for example (but use lm.influence for your 
application).  -Frank Harrell
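
The two suggestions above can be sketched roughly as follows. This is only an illustration, written for R and under stated assumptions: the built-in mtcars data stands in for fuel.frame (which ships with S-PLUS, not base R), and the object names (loo.loop, loo.hat, press) are hypothetical. Part (a) builds the design matrix once and calls lm.fit inside the loop; part (b) skips refitting entirely by using the hat values from lm.influence, relying on the ordinary least squares identity that the leave-one-out residual equals e_i / (1 - h_i). The sketch deliberately uses only the hat component, because the meaning of the coefficients component of lm.influence differs between S and R.

```r
# Stand-in data (assumption): mtcars ships with R; the thread's
# fuel.frame is an S-PLUS dataset.
fit <- lm(mpg ~ wt + disp, data = mtcars)

# Design matrix and response vector, built once by the high-level fit
X <- model.matrix(fit)
y <- model.response(model.frame(fit))
n <- nrow(X)

# (a) Explicit leave-one-out loop on the low-level fitter lm.fit
loo.loop <- numeric(n)
for (i in seq_len(n)) {
  b <- lm.fit(X[-i, , drop = FALSE], y[-i])$coefficients
  loo.loop[i] <- sum(X[i, ] * b)   # prediction for the held-out row
}

# (b) No refitting: leave-one-out prediction is y_i - e_i / (1 - h_i)
h <- lm.influence(fit)$hat         # leverages (hat-matrix diagonal)
e <- residuals(fit)                # residuals from the full fit
loo.hat <- y - e / (1 - h)

# The two routes should agree up to rounding error
all.equal(loo.loop, unname(loo.hat))

# Sum of squared leave-one-out prediction errors (PRESS)
press <- sum((y - loo.hat)^2)
```

Route (b) needs one fit per model rather than one fit per model per observation, which is why it should be much faster for a comparison like the one quoted below. It applies to ordinary linear models fitted by lm; other model classes would need the explicit loop.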

On Mon, 27 May 2002 09:39:02 +0200
Javier Seoane <seoane@ebd.csic.es> wrote:

> Thanks again to Dimitris C. Rizopoulos who suggested a different, more 
> efficient, way to obtain leave-one-out predictions for a series of models 
> (decreasing computing time from 250 to 49 min when three models were used). 
> His kind reply is copied below:
> 
> ======== Dimitris C. Rizopoulos (25/05/2002)
> model.1 <- lm(Fuel ~ Weight + Disp., fuel.frame)
> model.2 <- lm(Fuel ~ Weight + Mileage, fuel.frame)
> model.3 <- lm(Fuel ~ Weight + Type, fuel.frame)
> model.4 <- lm(Fuel ~ Weight + Type + Mileage, fuel.frame)
> model.5 <- lm(Fuel ~ Weight + Type + Mileage + Disp., fuel.frame)
> models <- list(model.1, model.2, model.3, model.4, model.5)
> ##########
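> tic <- proc.time()
> n <- nrow(fuel.frame)
> # one list slot per left-out observation
> predictions <- vector("list", n)
> for(i in 1:n) {
>   # refit every model without row i, then predict the held-out row
>   predictions[[i]] <- lapply(models, function(x) {
>     model <- update(x, data = fuel.frame[-i, ])
>     predict(model, newdata = fuel.frame[i, ])
>   })
> }
> tac <- proc.time() - tic   # time taken by the loop
> # rows = observations, columns = models
> results <- matrix(unlist(predictions), nrow = n,
>                   ncol = length(models), byrow = TRUE)
> dimnames(results) <- list(paste("observation", 1:n),
>                           paste("model.", 1:length(models), sep = ""))
> # sum of squared leave-one-out prediction errors, one value per model
> CV.results <- colSums((results - fuel.frame$Fuel)^2)
> tac
> results
> CV.results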
> 
> ===================================
> Javier Seoane
> Department of Applied Biology
> Estación Biológica de Doñana, CSIC
> Avda. María Luisa s/n
> 41013, Sevilla
> SPAIN


-- 
Frank E Harrell Jr              Prof. of Biostatistics & Statistics
Div. of Biostatistics & Epidem. Dept. of Health Evaluation Sciences
U. Virginia School of Medicine  http://hesweb1.med.virginia.edu/biostat
