On Sat, 23 Feb 2002, Wing, Michael wrote:
> I'm using S-Plus 2000 (Release 3) to create regression trees and am
> interested in investigating surrogate splits. I would like to explicitly
> specify a primary split from my candidate variables and allow the software
> to create trees based on this primary split. Any suggestions would be
> welcome.
I am assuming that you are using the built-in tree functions. There is
also the Rpart library, which has some improvements, but I don't know it
well enough to say whether or how to do the same things in Rpart.
For example, let's assume you are using the dataset fuel.frame and have
created the initial tree using:
> fuel.tree <- tree( Mileage ~ Weight + Disp., data=fuel.frame)
You can examine the effects of different splits by using burl.tree():
> tree.screens()
> plot(fuel.tree,type="u")
> text(fuel.tree)
> fuel.tree.burl <- burl.tree(fuel.tree)
This puts you into an interactive mode with the graph. Click on a split
in the graph and at the bottom of the screen you will see the effect of
all possible splits at that point. The x-value of the graph indicates the
splitting value (e.g. Disp. < 134) and the height of the lines is how much
the deviance is reduced by that split. This will give you a quick
graphical view of other splits that may also be good. Click on other
splits to see the effect. Right click when you want to quit the
interactive plot. Now fuel.tree.burl contains the information from the
last split you clicked on in case you want to compare the actual numbers
rather than just the graphs.
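If you want to check one of those line heights by hand, here is a rough
sketch. It assumes a regression tree, where the deviance of a node is the
sum of squared deviations of the response from the node mean. The
functions node.dev() and dev.reduction() are names I made up for
illustration; they are not part of the tree library.
> # deviance of a node: sum of squares about the node mean
> node.dev <- function(y) sum((y - mean(y))^2)
> # drop in deviance from splitting on x < cut (both sides must be nonempty)
> dev.reduction <- function(y, x, cut) {
+     left <- x < cut
+     node.dev(y) - node.dev(y[left]) - node.dev(y[!left])
+ }
> dev.reduction(fuel.frame$Mileage, fuel.frame$Weight, 2567.5)
This is for a split at the root, which uses all of the data. For a split
further down the tree you would first subset y and x to the observations
that fall in that node.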
Now that you have a different split that you want to try, you can use
edit.tree() to change the split and regrow the tree beyond that point.
> fuel.tree2 <- edit.tree(fuel.tree, node=1, var="Weight",
+ splitl = 2567.5)
Now fuel.tree2 is forced to have its first split based on Weight < 2567.5
and follows the standard pattern below that. You can edit other nodes
as well.
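To see what forcing the split has done, one option (just a sketch, using
the same plotting calls as above plus summary()) is to draw the new tree
and compare the two summaries:
> plot(fuel.tree2, type="u")
> text(fuel.tree2)
> # compare the number of terminal nodes and the residual mean deviance
> summary(fuel.tree)
> summary(fuel.tree2)
If the forced split costs very little in residual deviance, that suggests
it is a reasonable alternative to the split the software chose.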
Also look at identify.tree and browser.tree, which may be useful as well.
hope this helps,
--
Greg Snow, PhD Office: 223A TMCB
Department of Statistics Phone: (801) 378-7049
Brigham Young University Dept.: (801) 378-4505
Provo, UT 84602 email: gls@byu.edu