I am new to the classification and regression tree procedure in S-PLUS and
have run into some problems. My dataset is derived from satellite imagery; the
dependent variable is either a class (e.g. grass, water) for classification
trees or percent cover for regression trees. I ran the tree analysis, which gave
me 24 terminal nodes, then ran cross-validation to determine the optimal
pruning. Based on that, I decided on 12 terminal nodes. When I run the analysis
again, this time with cost-complexity pruning and the tree size set to 10, the
results in the report window still show the 24 terminal nodes, although the
graph window shows the tree with 10 nodes. How do I get the report window to
show the 10-node tree model?
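
In command form, the steps described above would look roughly like the sketch
below. This is only an approximation of the workflow: `sat.data` and
`cover.class` are placeholder names for the data frame and the class variable,
and I am assuming the standard `tree()`, `cv.tree()`, and `prune.tree()`
functions rather than the exact menu options.

```r
# Placeholder names (not from the actual analysis):
#   sat.data    - data frame of predictors derived from the imagery
#   cover.class - factor response (e.g. grass, water)

# Grow the full classification tree (24 terminal nodes in my case)
full.tree <- tree(cover.class ~ ., data = sat.data)

# Cross-validate over the cost-complexity pruning sequence
cv.result <- cv.tree(full.tree, FUN = prune.tree)
plot(cv.result)   # deviance against tree size, used to choose the size

# Cost-complexity pruning to a tree with 10 terminal nodes
pruned.tree <- prune.tree(full.tree, best = 10)

# Plot the pruned tree with its split labels
plot(pruned.tree)
text(pruned.tree)
```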

Also, some classes show up in more than one terminal node; for example, two
nodes are labeled grass. How do I know which rule to pick for the grass class?
I assume I should pick the node with the highest class probability, i.e. the
yprob shown in brackets in the tree model, but how do I know the order in which
the class probabilities are listed?

When I run a regression tree for percent cover, the nodes show a percent cover
value. Are those values the mean percent cover of the observations in each
node, and how do you determine the boundaries of the percent cover classes from
them?

Thanks in advance,
Andrea Laliberte