I'm using Splus 6.1 for Windows.
Although I am neither an advanced user nor a proficient programmer of Splus, I am
trying to use CARTs to analyze vegetation and soil data from an ecological
study. I am using the rpart library in Splus to do my analyses and create
my graphs. I've got questions about implementing rpart as well as the
statistical underpinnings of CART. I'll just outline them below. If anyone
can help, I'd greatly appreciate it!
Tree display
I decided to transform my response variables (arcsin square root for
percentages; log transformation for others) in order to stabilize the
results. Is it possible to back-transform within the rpart functions
before graphing the tree? If so, how do you do this? Is there syntax for
the Splus plot.rpart graphing command that can do this? Or do you have
to transform the results separately and create a new file for plot.rpart
(or replace numbers in the rpart file)?
Is there a way to print summary statistics (aside from the split value and
the n) for the nodes? Again, I am wondering if there is a way to do this
through the plot.rpart command or whether this has to be added to the graph
externally. The same question holds for printing the proportion of total
sum of squares explained by each split - how can you display this? (I
realize you can just graph the branches with their variable lengths instead
of uniformly, but I am wondering how you would write the command to extract
the proportion of the total sum of squares...)
CART Procedures
Is it possible to denote stratified samples or sampling units in the CART
analysis? For instance, if multiple soil samples come from one site and you
have multiple sites, can you test for differences of all the samples, or do
you have to use means from the sites? I have read one study that used a
stratification based on coral reefs; reefs were not used as an explanatory
variable, but rather "to form subsets for cross validations" since they were
sampling units. Unfortunately, I have not figured out how to implement
this. Any guidance?
In the case of substitution of explanatory variables, how do you choose an
alternative variable, graph it in the tree, and recalculate the statistics
in Splus?
I'd be happy to provide more details about my study if anyone wants to
discuss this further. Thanks in advance for your time, consideration, and
insight!
Kind regards,
Nancy Golubiewski
Doctoral Candidate
Cooperative Institute for Research in Environmental Sciences and
Department of Ecology and Evolutionary Biology
University of Colorado-Boulder
Boulder, CO 80309 USA