Re: Force a split in Rpart3

To: s-news@lists.biostat.wustl.edu, lsdst8+@pitt.edu
Subject: Re: Force a split in Rpart3
From: Terry Therneau <therneau@mayo.edu>
Date: Tue, 18 Jan 2005 14:20:08 -0600 (CST)
Reply-to: Terry Therneau <therneau@mayo.edu>
Query:

I would like to force a specific variable to always be the first split
criterion in rpart3, somewhat analogous to forcing a variable into a
stepwise regression.  I have not come across this in any of the literature
(I might have just missed it).  Can this be done?  If so, what steps do I
have to take?  Thanks

Scott

-----------

  One solution is to use the "cost" argument.  Make the variable of interest
very "cheap" and all the others costly.  Then the program is guaranteed to
choose your variable of interest first.  If you want to force a particular
split point on that variable, just replace it with a yes/no variable coded
at that cut point.  The default is to have all costs=1, so set the chosen
variable to have .001 or some other small number.  Cost affects only which
variable is chosen, not any of the other statistics of a split.
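
  A minimal sketch of the idea in R, assuming the rpart package and its
"cost" argument (one non-negative cost per predictor, matched by position to
the order of the predictors in the formula).  The mtcars data, the choice of
qsec as the forced variable, and the cut point of 18 are illustrative
assumptions, not part of the original question.

```r
library(rpart)

# Predictors in formula order: wt, hp, qsec.
# cost is matched by position, so the third entry makes qsec nearly free.
costs <- c(1, 1, 0.001)

fit <- rpart(mpg ~ wt + hp + qsec, data = mtcars,
             method = "anova", cost = costs)

# The first row of fit$frame is the root node; var names its split variable.
root_var <- as.character(fit$frame$var[1])
print(root_var)

# To force a particular cut point, replace the variable with a yes/no
# version of it (hypothetical cut at qsec >= 18) and make that one cheap.
d <- transform(mtcars,
               qsec_slow = factor(qsec >= 18, levels = c(FALSE, TRUE),
                                  labels = c("no", "yes")))

fit2 <- rpart(mpg ~ qsec_slow + wt + hp, data = d,
              method = "anova", cost = c(0.001, 1, 1))
root_var2 <- as.character(fit2$frame$var[1])
print(root_var2)
```

  Since the costs are not tied to a particular node, the cheap variable
should be favored at every split and not only at the root.  The yes/no
version avoids that in practice, because it has nothing left to split on
once it has been used.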

  (Cost was intended so we could penalize some variables that were
very expensive to collect.  This is a new and interesting use for it.)

        Terry Therneau