Dear list,
I have run into a couple of problems with predict() on rpart() models
under S-Plus 6.2. Consider the following model:
> temp.model
n= 369
node), split, n, loss, yval, (yprob)
      * denotes terminal node
 1) root 369 83 1 (0.2249322 0.7750678)
   2) as.factor(Q7.02)=N 95 42 1 (0.4421053 0.5578947)
     4) as.factor(Q3.06)=N 20 3 0 (0.8500000 0.1500000) *
     5) as.factor(Q3.06)=Y 75 25 1 (0.3333333 0.6666667) *
   3) as.factor(Q7.02)=Y 274 41 1 (0.1496350 0.8503650)
     6) as.factor(Q3.01)=N 98 32 1 (0.3265306 0.6734694)
      12) as.factor(Q1.01)=Y 14 4 0 (0.7142857 0.2857143) *
      13) as.factor(Q1.01)=N 84 22 1 (0.2619048 0.7380952) *
     7) as.factor(Q3.01)=Y 176 9 1 (0.0511363 0.9488636) *
and a single row of the data frame Training:
> Training[i, ]
              Species Q1.01 Q1.02 Q1.03 Q2.01 Q2.02 Q2.03 Q2.04 Q3.01 Q3.02 Q3.03 Q3.04 Q3.05 Q4.01 Q4.02 Q4.03 Q4.04 Q4.05 Q4.06 Q4.07
X2 Abies nordmanniana     Y     Y     N     1     0     N     Y     Y     Y     N     N     N     N     N     N     N    NA    NA    NA
   Q4.08 Q4.09 Q4.1 Q4.11 Q4.12 Q5.01 Q5.02 Q5.03 Q5.04 Q6.01 Q6.02 Q6.03 Q6.04 Q6.05 Q6.06 Q6.07 Q7.01 Q7.02 Q7.03 Q7.04 Q7.05 Q7.06 Q7.07
X2     N    NA    N    NA     N     N     N    NA    NA    NA    NA     Y     N     N     Y     0    NA     N     Y     N     Y     N     N
   Q7.08 Q8.01 Q8.02 Q8.03 Q8.04 Q8.05 Q3.06 Q7.09 Q6.07.Factor Weed.Class Binary.Outcome
X2     N    NA    NA    NA    NA    NA     Y     N        short          0              0
First, predict() fails when asked to predict for a single observation:
> predict(object = temp.model, newdata = Training[i, ], type = "prob")
Problem in dimnames(pred) <- list(names(where), ylevels): Cannot have
dimnames for nonarray
Use traceback() to see the call stack
However, predict() will happily predict for two or more rows:
> predict(object = temp.model, newdata = rbind(Training[i, ],
Training[i, ]), type = "prob")
0 1
X2 0.85 0.15
X21 0.85 0.15
Second, these predictions are incorrect: an observation with Q7.02="N"
and Q3.06="Y" should end up at terminal node 5, with fitted probabilities
of (0.3333333 0.6666667), not at node 4 with the returned (0.85, 0.15).
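To make the expected node explicit, the splits printed above can be followed by hand. The sketch below is only an illustration: expected.node() is a hypothetical helper written for this post (it is not part of rpart), the node numbers are copied from the printed tree, and obs holds just the four split variables from row X2.

```r
# Hypothetical helper: follow the splits of temp.model by hand.
# Node numbers are those shown in the printed tree above.
expected.node <- function(row) {
  if (as.character(row$Q7.02) == "N") {
    if (as.character(row$Q3.06) == "N") 4 else 5
  } else if (as.character(row$Q3.01) == "N") {
    if (as.character(row$Q1.01) == "Y") 12 else 13
  } else {
    7
  }
}

# The four split variables as they appear in row X2 of Training
obs <- data.frame(Q1.01 = "Y", Q3.01 = "Y", Q3.06 = "Y", Q7.02 = "N")
expected.node(obs)   # node 5, i.e. yprob (0.3333333 0.6666667)
```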
Running the same code in R 2.0 gives what appears to be the correct answer:
> predict(object=temp.model,newdata=Training[i,],type="prob")
0 1
2 0.3333333 0.6666667
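For anyone who wants to try the comparison without my data, here is a rough sketch of the same check. It assumes the rpart package and its bundled kyphosis data (not my Training data frame), and the object names fit, one and two are made up for the example. If predict() is behaving, the single-row prediction should match the first row of the duplicated-row prediction.

```r
library(rpart)

# Stand-in model on the kyphosis data that ships with rpart
# (an assumption for illustration; not the model shown above)
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
             method = "class")

# Predict for one row, then for the same row supplied twice
one <- predict(fit, newdata = kyphosis[1, ], type = "prob")
two <- predict(fit, newdata = kyphosis[c(1, 1), ], type = "prob")

# Both calls should give identical class probabilities for that row
all.equal(as.vector(one), as.vector(two[1, ]))
```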
Can anybody help with the S side of things? [I know what the R users will
say!]
cheers
Peter
*********************************************************************
Dr Peter Caley
CSIRO Entomology
GPO Box 1700, Canberra,
ACT 2601
Email: peter.caley@csiro.au
Ph: +61 (0)2 6246 4076 Fax: +61 (0)2 6246 4000
*********************************************************************