Query:
I would like to force a specific variable to always be the first split criterion
in rpart3, somewhat analogous to forcing a variable into a stepwise regression.
I have not come across this in any of the literature (I might have just missed
it). Can this be done? If so what steps do I have to take? Thanks
Scott
-----------
One solution is to use the "cost" argument. Make the variable of interest
very "cheap" and all the others costly. Then the program is guaranteed to
choose your variable of interest first. If you want to force a particular
split point on that variable, just replace it with a yes/no indicator. The
default is for every cost to be 1, so give the chosen variable .001
or some other small number. Cost affects only which variable is chosen,
not any of the other statistics of a split.
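
A minimal sketch of the cost trick in R, assuming the current rpart package
(where rpart() takes a "cost" vector with one entry per predictor, in formula
order) and using the built-in mtcars data purely for illustration. The
variable hp, the cut point 150, and the names hp_high, cars, fit and fit2 are
arbitrary choices for this example, not part of the original question. Note
that the cost applies at every node, not only at the root, so the cheap
variable is favoured further down the tree as well.

```r
library(rpart)

# Predictors in formula order: cyl, disp, hp, wt.
# hp gets a tiny cost; the others keep the default cost of 1.
costs <- c(1, 1, 0.001, 1)
fit <- rpart(mpg ~ cyl + disp + hp + wt, data = mtcars, cost = costs)

# Row 1 of fit$frame describes the root node; `var` names its split variable.
root_var <- as.character(fit$frame$var[1])
print(root_var)

# To force a particular cut point (hp >= 150 here, an arbitrary choice),
# replace the variable with a yes/no indicator and make the indicator cheap.
cars <- transform(mtcars,
                  hp_high = factor(hp >= 150, labels = c("no", "yes")))
fit2 <- rpart(mpg ~ cyl + disp + wt + hp_high, data = cars,
              cost = c(1, 1, 1, 0.001))
root_var2 <- as.character(fit2$frame$var[1])
print(root_var2)
```

Printing the fitted object (print(fit)) shows the full tree, which makes it
easy to check which variable was used at the top and at the later splits.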
(Cost was intended so we could penalize some variables that were
very expensive to collect. This is a new and interesting use for it.)
Terry Therneau