Joseph,
I think you'll find that it's the factor that is tripping you up. Even
though you're sampling the observations, there's no way for S to know that
you really want to think about only the levels included in your subsample.
The factor itself hasn't been redefined.
One way to solve the problem is to redefine the factor each time you subset
it, which drops the unused levels. In R this would be:
s$fruit <- factor(s$fruit)
From memory, Splus might not drop levels by default, so you might have to
check the documentation.
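As a rough sketch of what I mean, here is a small R example. The data frame and the column name "id" are made up for illustration (they only mimic your sample data), and I've used plain logical indexing instead of select.rows so that it runs in R:

```r
# Hypothetical data mimicking the example in the quoted message
r <- data.frame(id = 1:4,
                fruit = factor(c("apple", "apple", "bananna", "orange")))

# Subsetting the rows leaves the factor's level set untouched,
# so "apple" is still a level even though no row contains it
s <- r[r$fruit != "apple", ]
levels(s$fruit)

# Re-creating the factor keeps only the levels that actually occur
s$fruit <- factor(s$fruit)
levels(s$fruit)
summary(s)
```

After the factor() call, summaries and plots of s$fruit should no longer show a zero-count "apple" entry.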
Also, I'd be very surprised if your code below worked in R 1.8.1. The
underscore has been eliminated as an assignment operator.
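For what it's worth, the two assignments from your example would look like this in R, with "<-" in place of the underscore. This is only a sketch of the syntax change; nothing else about the lines is altered:

```r
# Same statements as in the quoted message, using "<-" for assignment
r <- data.frame(c(1:4))
r <- cbind(r, c("apple", "apple", "bananna", "orange"))
```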
Andrew
On Wednesday 24 March 2004 06:33, Joseph LeBouton wrote:
> Hello all,
>
> HELP! I'm running S-plus 6.2, but the problem is the same in R 1.8.1. I
> apparently misunderstand the select.rows command. Can anybody help me with
> this?
>
> I ran into a problem trying to select rows (of a factor variable) from a
> dataframe and use the selected subset of factors to create summary
> statistics and particularly barcharts. I keep getting barchart labels (and
> summary stats) for factors I excluded using the select.rows
> command. Values associated with the non-selected factor values are zeros,
> but I want these factor values to disappear altogether from both the
> summary stats and the barcharts! In the full dataset I want to select 6-8
> species from a total of 39 species. I've included some sample code that
> replicates my problem.
>
> thanks,
>
> jlb
>
>
> ########
> #EXAMPLE CODE THAT REPLICATES PROBLEM:
>
>
> #Problem with select.rows to exclude observations from a barchart:
>
> #Create dataframe with 4 rows, 2 columns:
> r_data.frame(c(1:4))
> r_cbind(r, c("apple","apple","bananna","orange"))
>
> #Select rows of dataframe 'r' to work with; exclude "apple" from dataframe.
> s_select.rows(r, r[,2]!="apple")
>
> #summarize 'r' and 's'; notice that "apple" still appears in summary of
> 's', # though it occurs zero times in the dataframe.
> summary(r)
> summary(s)
>
> #Do a bar chart of both dataframes. Notice that "apple" appears in the
> # bar chart of both the 1st and 2nd dataframes, though all occurrences of
> # "apple" should have been excluded by the select.rows.
> barchart(r[,2]~r[,1])
> barchart(s[,2]~s[,1])
>
> ************************************************************************
> Joseph LeBouton
> PhD candidate
> Michigan State University
> Department of Forestry
> East Lansing, Michigan 48824
>
> (517) 355-7744
>
> lebouton@msu.edu
>
> --------------------------------------------------------------------
> This message was distributed by s-news@lists.biostat.wustl.edu. To
> unsubscribe send e-mail to s-news-request@lists.biostat.wustl.edu with
> the BODY of the message: unsubscribe s-news
--
Andrew Robinson Ph: 208 885 7115
Department of Forest Resources Fa: 208 885 6226
University of Idaho E : andrewr@uidaho.edu
PO Box 441133 W : http://www.uidaho.edu/~andrewr
Moscow ID 83843 Or: http://www.biometrics.uidaho.edu
No statement above necessarily represents my employer's opinion.