Many thanks to Leonid Gibiansky and Gérald Jean for their
blisteringly fast answers. They both kindly pointed out that I
should replace "i" with i in expressions such as
original.matrix[tmp.folds=="i",] (I am ashamed to have missed what
should have been obvious; thank you for your patience with a
novice). What follows is the corrected version of my code (which
works) and the original question.
Corrected version of code:
predictions <- cbind("id"=original.matrix[,"id"],
"new.predictions"=NA)
for (i in 1:max(tmp.folds)){
test.matrix <- original.matrix[tmp.folds==i,]
training.matrix <- original.matrix[tmp.folds!=i,]
tmp.model <- update(final.model, data=training.matrix)
new.predictions <- predict.gam(tmp.model, test.matrix,
type="response")
predictions[match(test.matrix[,"id"], predictions[,"id"]),
"new.predictions"] <- new.predictions
}
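A short illustration of why the quoted version selected no rows,
written as a minimal sketch in R syntax. The fold labels here are
made up for the example; the 0-row error reported in the original
question suggests S-PLUS behaves the same way.

```r
# Hypothetical fold labels: 20 rows assigned to folds 1..10
# (illustrative only, not the real tmp.folds)
tmp.folds <- rep(1:10, length.out = 20)
i <- 3

# Quoted: every fold label is compared with the literal string "i",
# so no element matches and the subset has zero rows
n.quoted <- sum(tmp.folds == "i")    # 0

# Unquoted: every fold label is compared with the current value of i
n.unquoted <- sum(tmp.folds == i)    # 2 (rows 3 and 13)
```

With the quotes, test.matrix is empty on every pass through the loop,
which is what predict.gam complains about.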
The original question:
Dear S-plus users:
I am trying to update a column in a data frame with the values that
result from a for loop, but I cannot get the loop to work.
My final aim is to get predictions from a glm under a ten-fold
cross-validation scheme, and then to estimate some measure of
discrimination ability such as kappa or AUC (I know that I could
use the validate function from Frank Harrell's Design library to
get Somers' Dxy, which is related to AUC, but that function will
not give me kappa).
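For reference, once cross-validated predictions are available, both
measures can be computed by hand. The following is a minimal sketch
in R syntax, not a tested S-PLUS implementation. It assumes a binary
0/1 response and a classification threshold of 0.5, and the names
cohen.kappa, auc.rank, observed and predicted are hypothetical
(none of them appears in the code in this post). The toy data are
invented for the example.

```r
# Cohen's kappa for a binary (0/1) outcome at a given threshold.
# Assumes both classes can occur; undefined if expected agreement is 1.
cohen.kappa <- function(observed, predicted, threshold = 0.5) {
  pred.class <- factor(as.numeric(predicted >= threshold), levels = c(0, 1))
  obs.class <- factor(observed, levels = c(0, 1))
  tab <- table(obs.class, pred.class)      # 2 x 2 confusion table
  n <- sum(tab)
  p.obs <- sum(diag(tab)) / n              # observed agreement
  p.exp <- sum(rowSums(tab) * colSums(tab)) / n^2  # chance agreement
  (p.obs - p.exp) / (1 - p.exp)
}

# AUC from the rank-sum (Mann-Whitney) formula; ties get average ranks
auc.rank <- function(observed, predicted) {
  r <- rank(predicted)
  n1 <- sum(observed == 1)
  n0 <- sum(observed == 0)
  (sum(r[observed == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Invented toy data, for illustration only
observed <- c(0, 0, 1, 1, 0, 1, 1, 0)
predicted <- c(0.1, 0.4, 0.8, 0.6, 0.7, 0.9, 0.3, 0.2)

kappa.cv <- cohen.kappa(observed, predicted)  # 0.5
auc.cv <- auc.rank(observed, predicted)       # 0.8125
dxy.cv <- 2 * (auc.cv - 0.5)                  # Somers' Dxy: 0.625
```

In the cross-validation setting, predicted would be the
"new.predictions" column and observed the response for the same
rows. The last line uses the relation Dxy = 2 * (AUC - 0.5) for a
binary response.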
This is my clumsy code:
> version
S-PLUS 2000 Professional Edition Release 2 for Microsoft
Windows : 1999
predictions <- cbind("id"=original.matrix[,"id"],
"new.predictions"=NA) # these NA's are what I want to update
for (i in 1:max(tmp.folds)){ # where tmp.folds goes from 1 to 10
# "i in 1:10" is simpler but less generalizable
test.matrix <- original.matrix[tmp.folds=="i",]
training.matrix <- original.matrix[tmp.folds!="i",]
# final.model is a previously fitted glm
tmp.model <- update(final.model, data=training.matrix)
new.predictions <- predict.gam(tmp.model, test.matrix,
type="response") # predictions on the i fold
predictions[match(test.matrix[,"id"], predictions[,"id"]),
"new.predictions"] <- new.predictions
}
The loop gives me this error:
Error in safe.predict.gam(object, newdata, type, se.fit, terms): 0
rows in newdata
It seems that "newdata" is faulty, but here "newdata" is
test.matrix, which is defined at the top of each iteration. What
surprises me more is that the code works if I set the value of i to
1, run the code, then set i to 2, run the code again, and so on.
Maybe I am trying to reinvent the wheel and there is already a
function that does what I want (I could not find one in S-news,
though). Can you point me to some specific reading on this, or
suggest what may be wrong in the loop?
Any help would be appreciated.
Javier Seoane
Estacion Biologica de Doñana
Departamento de Biología Aplicada
Avda. Maria Luisa s/n (Pabellon del Peru)
Sevilla 41013
España (SPAIN)
Tel: + 034 95 423 23 40
seoane@ebd.csic.es