Dear S-Plus people,
I am doing a regression tree analysis for my boss using rpart and S-Plus 6.1. I
am doing 10-fold cross-validations and repeating my analyses to choose the modal
best-sized tree (chosen with the 1-SE rule). My query is this: depending on
whether I leave the rpart controls at their default values or set minbucket=1
to obtain a "fully-leafed" tree, I end up with different best-sized trees. My
pruned tree with the default rpart.control values has one split, but the one
with minbucket=1 has three. I thought that choosing minbucket=1 would only grow
the same tree further, so that pruning would bring it back to the same result,
not a different one. Can anyone advise on how to choose the best tree in this
case?
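
In case it helps to pin down the selection procedure I am describing, here is a
minimal sketch of it in Python (not S-Plus). It assumes each cross-validation
run yields a table with the nsplit, xerror and xstd columns of rpart's cptable.
The function names and all the numbers are hypothetical, made up purely for
illustration; they are not my actual results.

```python
from collections import Counter

def one_se_size(cp_table):
    """Return the number of splits chosen by the 1-SE rule.

    cp_table is a list of (nsplit, xerror, xstd) rows, mirroring the
    nsplit, xerror and xstd columns of an rpart cptable.
    """
    # Row with the smallest cross-validated error.
    best = min(cp_table, key=lambda row: row[1])
    # 1-SE threshold: minimum xerror plus its standard error.
    threshold = best[1] + best[2]
    # Smallest tree whose xerror is within one SE of the minimum.
    return min(row[0] for row in cp_table if row[1] <= threshold)

def modal_size(cp_tables):
    """Return the most frequent 1-SE size over repeated CV runs.

    Ties are broken in favour of the smaller tree (an assumption).
    """
    counts = Counter(one_se_size(t) for t in cp_tables)
    top = max(counts.values())
    return min(size for size, n in counts.items() if n == top)

# Hypothetical cptable excerpts from three repeated 10-fold CV runs.
run_a = [(0, 1.00, 0.10), (1, 0.70, 0.08), (2, 0.66, 0.08), (3, 0.65, 0.08)]
run_b = [(0, 1.02, 0.10), (1, 0.80, 0.09), (2, 0.72, 0.08), (3, 0.60, 0.07)]
run_c = [(0, 1.01, 0.10), (1, 0.69, 0.08), (2, 0.67, 0.08), (3, 0.68, 0.08)]

print(modal_size([run_a, run_b, run_c]))  # prints 1
```

With these made-up tables, the 1-SE rule picks one split in runs a and c and
three splits in run b, so the modal choice is one split.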
Thanks,
Isabelle Robichaud