Summary: Updating a data frame inside a loop

To: s-news@lists.biostat.wustl.edu
Subject: Summary: Updating a data frame inside a loop
From: "Javier Seoane" <seoane@ebd.csic.es>
Date: Fri, 14 Dec 2001 17:48:09 +0100
Reply-to: seoane@ebd.csic.es
Many thanks to Leonid Gibiansky and Gérald Jean for their 
blisteringly fast answers. They both kindly pointed out that I should 
replace "i" with i in expressions such as 
original.matrix[tmp.folds=="i",]: the quoted form compares the fold 
labels with the literal string "i", which never matches, so no rows 
are selected. (I am ashamed of what should have been obvious to me; 
thank you for your patience with a novice.) What follows is the 
corrected version of my code, which works, and the original question.

Corrected version of code: 

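# Matrix of ids with a column of NAs to be filled with predictions
predictions <- cbind("id"=original.matrix[,"id"], 
                     "new.predictions"=NA)

for (i in 1:max(tmp.folds)){
    # Fold i is held out for testing; the rest is used for fitting
    test.matrix <- original.matrix[tmp.folds==i,]
    training.matrix <- original.matrix[tmp.folds!=i,]
    # Refit the model on the training folds only
    tmp.model <- update(final.model, data=training.matrix)
    new.predictions <- predict.gam(tmp.model, test.matrix, 
                                   type="response")
    # Store the predictions in the rows that match the test ids
    predictions[match(test.matrix[,"id"], predictions[,"id"]), 
                "new.predictions"] <- new.predictions
}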
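
For readers who want to try the corrected loop without the original data, here is a minimal self-contained sketch in R (not S-PLUS 2000). The simulated data frame, the variables x and y, the seed, the random fold assignment and the logistic model are all assumptions made only for illustration. The sketch uses the generic predict() instead of predict.gam(), since the model here is a plain glm, and a data frame instead of the cbind() matrix.

```r
# Simulated stand-ins for the objects in the post (assumptions, not the original data)
set.seed(1)
n <- 200
original.matrix <- data.frame(id = 1:n, x = rnorm(n))
original.matrix$y <- rbinom(n, 1, plogis(0.5 * original.matrix$x))

# Assign each row to one of 10 folds at random
tmp.folds <- sample(rep(1:10, length.out = n))

# Model fitted once on all the data, then refitted inside the loop
final.model <- glm(y ~ x, family = binomial, data = original.matrix)

# One row per id; the NAs are replaced fold by fold
predictions <- data.frame(id = original.matrix$id, new.predictions = NA)

for (i in 1:max(tmp.folds)) {
    test.matrix <- original.matrix[tmp.folds == i, ]      # held-out fold
    training.matrix <- original.matrix[tmp.folds != i, ]  # remaining folds
    tmp.model <- update(final.model, data = training.matrix)
    new.predictions <- predict(tmp.model, newdata = test.matrix,
                               type = "response")
    rows <- match(test.matrix$id, predictions$id)
    predictions$new.predictions[rows] <- new.predictions
}
```

After the loop every row of predictions holds an out-of-fold predicted probability, which can then be passed to whatever discrimination measure is wanted.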


The original question:

Dear S-plus users:

I am trying to update a column in a data frame with the values that 
result from a for loop, but I cannot get the loop to work. 
My final aim is to get predictions from a glm with a ten-fold 
cross-validation scheme, and then to estimate some measure of 
discrimination ability such as kappa or AUC (I know that I could 
use the function validate from Frank Harrell's Design library to get 
Somers' Dxy, which is related to AUC, but the function will not give 
me kappa).
This is my clumsy code:

> version
S-PLUS 2000 Professional Edition Release 2 for Microsoft 
Windows : 1999 

predictions <- cbind("id"=original.matrix[,"id"], 
"new.predictions"=NA)   # these NA's are what I want to update


for (i in 1:max(tmp.folds)){    # where tmp.folds goes from 1 to 10
                                # "i in 1:10" is simpler but less generalizable 
test.matrix <- original.matrix[tmp.folds=="i",]  
training.matrix <- original.matrix[tmp.folds!="i",]
tmp.model <- update(final.model, data=training.matrix) # final.model is a glm previously fitted
new.predictions <- predict.gam(tmp.model, test.matrix, 
type="response") # predictions on the i fold
predictions[match(test.matrix[,"id"], predictions[,"id"]), 
"new.predictions"] <- new.predictions    
}


The loop gives me this error:
Error in safe.predict.gam(object, newdata, type, se.fit, terms): 0 
rows in newdata

It seems that "newdata" is faulty, but here "newdata" is 
test.matrix, which is defined at the top of each iteration of the 
loop. What surprises me more is that the code works if I set i to 1 
and run the code, then set i to 2 and run the code again, and so on. 
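
The "0 rows in newdata" error can be reproduced on its own with a short R sketch. The fold vector below is a hypothetical stand-in (three rows per fold), not the original tmp.folds. Comparing the labels with the quoted "i" tests them against a literal one-letter string, so no row is ever selected, whereas the unquoted i uses the current value of the loop variable.

```r
tmp.folds <- rep(1:10, times = 3)   # hypothetical fold labels, 3 rows per fold
i <- 2

# Quoted: every label is compared with the string "i", so nothing matches
n.quoted <- sum(tmp.folds == "i")

# Unquoted: labels are compared with the value of i, so fold 2 is selected
n.unquoted <- sum(tmp.folds == i)

print(c(quoted = n.quoted, unquoted = n.unquoted))
```

With the quoted form the test set is empty on every pass through the loop, which is exactly what the prediction function complains about.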

Maybe I am trying to reinvent the wheel and there is already a 
function to do what I want (I could not find it in S-news, though). 
Can you point me to some specific reading on this, or suggest what 
may be wrong in the loop?

Any help would be appreciated.




Javier Seoane
Estacion Biologica de Doñana
Departamento de Biología Aplicada
Avda. Maria Luisa s/n (Pabellon del Peru)
Sevilla 41013
España (SPAIN)
Tel: + 034 95 423 23 40 
seoane@ebd.csic.es
