In case anyone agrees, I have a method in the S Hmisc
library for making SAS code out of S functions (at
least some of them) called sascode. If you wanted
to call the function sascode.rpart that would
fit in nicely with Hmisc. -Frank Harrell
Terry Therneau wrote:
>
> output, i.e., an Splus function that would write the program for him.
> There isn't such a function, but it was an interesting exercise to do so.
> (No, I didn't really have time for this, but it was an interesting puzzle
> over lunch hour.)
> Below is a modified "print.rpart" function that does what he desires.
> 1. Some of the "if" statements could be replaced by "else" statements,
> but that looks like too much work to figure out.
>
> 2. The output is not correct for a categorical variable: it prints
> out the SAS code "if (Type=Large,Medium,Van)" for instance. I'll
> let someone else add that enhancement.
>
> -----------------------------------------------------
> S-PLUS : Copyright (c) 1988, 2000 MathSoft, Inc.
> S : Copyright Lucent Technologies, Inc.
> Version 6.0 Release 1 for Sun SPARC, SunOS 5.6 : 2000
> Working data will be in .Data
> > fit
> n= 228
>
> node), split, n, loss, yval, (yprob)
> * denotes terminal node
>
> 1) root 228 63 2 (0.2763158 0.7236842)
> 2) time>=171.5 169 61 2 (0.3609467 0.6390533)
> 4) age< 45.5 10 2 1 (0.8000000 0.2000000) *
> 5) age>=45.5 159 53 2 (0.3333333 0.6666667)
> 10) time< 304 76 32 2 (0.4210526 0.5789474)
> 20) ph.ecog< 1.5 62 30 2 (0.4838710 0.5161290)
> 40) age< 69.5 48 22 1 (0.5416667 0.4583333)
> 80) time>=219.5 29 10 1 (0.6551724 0.3448276) *
> 81) time< 219.5 19 7 2 (0.3684211 0.6315789) *
> 41) age>=69.5 14 4 2 (0.2857143 0.7142857) *
> 21) ph.ecog>=1.5 14 2 2 (0.1428571 0.8571429) *
> 11) time>=304 83 21 2 (0.2530120 0.7469880)
> 22) time>=802.5 7 2 1 (0.7142857 0.2857143) *
> 23) time< 802.5 76 16 2 (0.2105263 0.7894737) *
> 3) time< 171.5 59 2 2 (0.0338983 0.9661017) *
>
> > rparttosas(fit)
> if (time>=171.5) then do;
> if (age< 45.5) then rpnode= 4 ;
> if (age>=45.5) then do;
> if (time< 304) then do;
> if (ph.ecog< 1.5) then do;
> if (age< 69.5) then do;
> if (time>=219.5) then rpnode= 80 ;
> if (time< 219.5) then rpnode= 81 ;
> end;
> if (age>=69.5) then rpnode= 41 ;
> end;
> if (ph.ecog>=1.5) then rpnode= 21 ;
> end;
> if (time>=304) then do;
> if (time>=802.5) then rpnode= 22 ;
> if (time< 802.5) then rpnode= 23 ;
> end; end; end;
> if (time< 171.5) then rpnode= 3 ;
>
> ---------------------
>
> The idea is to create variable rpnode, which has the value of the starred
> lines in the printout.
>
> Terry Therneau
>
> -------------------------------------------------
> #
> # Here is a special program written at the behest of a user.
> # He wants a SAS program that will create the predicted node number
> # for each subject.
> # The code is 99% a copy of print.rpart.
> rparttosas <- function(x, minlength=0, spaces=3, cp,
> digits=.Options$digits, ...) {
> if(!inherits(x, "rpart")) stop("Not legitimate rpart object")
> if (!is.null(x$frame$splits)) x <- rpconvert(x) #help for old objects
>
> if (!missing(cp)) x <- prune.rpart(x, cp=cp)
> frame <- x$frame
> node <- as.numeric(row.names(frame))
> depth <- tree.depth(node)
> indent <- paste(rep(" ", spaces * 32), collapse = "")
> #32 is the maximal depth
> if(length(node) > 1) {
> indent <- substring(indent, 1, spaces * 1:max(depth))
> indent <- c("", indent[depth])
> }
> else stop("Tree has only 1 node")
>
> # this is the ending part of each line
> term <- ifelse(frame$var == "<leaf>", paste("then rpnode=", node, ";"),
> "then do;")
> # the first part of the line
> z <- labels(x, digits=digits, minlength=minlength, ...)
> z <- paste(indent, "if (", z, ") ", sep='')
>
> # add in the "end" statements that go with each "do"
> delta <- -diff(c(depth,1)) #leftward movement of the indent
> temp <- paste(rep("end;", max(delta)), collapse=' ')
> endlist <- substring(temp, 1, 5*delta)
> endstring <- ifelse(delta>0, paste("\n", indent, endlist, sep=''), "")
> z <- paste(z, term, endstring, sep='')
>
> cat(z[-1], sep = "\n") # the -1 prevents listing the root node
> return(invisible(x))
> }
>
> ---------------------------------------------------------------------
> This message was distributed by s-news@lists.biostat.wustl.edu. To
> unsubscribe send e-mail to s-news-request@lists.biostat.wustl.edu with
> the BODY of the message: unsubscribe s-news
--
Frank E Harrell Jr Prof. of Biostatistics & Statistics
Div. of Biostatistics & Epidem. Dept. of Health Evaluation Sciences
U. Virginia School of Medicine http://hesweb1.med.virginia.edu/biostat
|