Dear S-Plus people,
Regarding my question about getting different trees (with rpart, after
cross-validation and the 1-SE rule) for different settings of minbucket in
rpart.control, Andy Liaw replied that one should not necessarily expect to
get the same trees. My question then is: how does one choose which setting
of minbucket is appropriate for a given problem? I get quite different
results with minbucket=1 than if I leave it at the default. Does a tree
pruned from the "total" tree (i.e. minbucket=1) reflect the data better?
My interest is not in prediction but in description. I have two data sets,
one of 170 observations and the other of 506.
Thanks again,
Isabelle Robichaud
p.s. S-Plus 6.1 on Win98.