s-news

Re: problem with select.rows

To: Joseph LeBouton <lebouton@msu.edu>, s-news@lists.biostat.wustl.edu
Subject: Re: problem with select.rows
From: Andrew Robinson <andrewr@uidaho.edu>
Date: Wed, 24 Mar 2004 06:56:09 -0800
In-reply-to: <6.0.1.1.0.20040324091943.01dc4e00@mail.msu.edu>
Organization: University of Idaho
References: <6.0.1.1.0.20040324091943.01dc4e00@mail.msu.edu>
User-agent: KMail/1.5.4

Joseph,

I think you'll find that it's the factor that is tripping you up.  Even 
though you're subsetting the observations, S has no way of knowing that 
you only want the levels that actually occur in your subsample.  
The factor itself hasn't been redefined, so it still carries all of its 
original levels.

One way to solve the problem is to redefine the factor each time you use it, 
dropping the levels.  In R this would be:

s$fruit <- factor(s$fruit)

From memory, S-PLUS might not drop unused levels by default, so you might 
have to check the documentation.
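
A minimal sketch of the idea in R, in case it helps.  The column name
'id' is made up for illustration, and plain bracket indexing stands in for
select.rows, so treat this as an assumption-laden example rather than a
drop-in fix:

```r
# Small data frame resembling the quoted example ('id' is a hypothetical name).
r <- data.frame(id = 1:4,
                fruit = factor(c("apple", "apple", "bananna", "orange")))

# Subset the rows; the factor still remembers all three levels.
s <- r[r$fruit != "apple", ]
levels(s$fruit)   # "apple" is still listed
table(s$fruit)    # "apple" shows up with a count of zero

# Redefine the factor so that only the levels present in 's' remain.
s$fruit <- factor(s$fruit)
levels(s$fruit)   # "apple" is gone
```

After the last step, summaries and plots of s$fruit should only show the
levels that remain in the subset.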

Also, I'd be very surprised if your code below worked in R 1.8.1.  The 
underscore has been eliminated as an assignment operator.
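
As a rough illustration, the assignments in your example could be written
with "<-" instead.  select.rows is an S-PLUS function, so the sketch below
swaps in ordinary bracket indexing for that step (my substitution, not your
original call):

```r
# The quoted assignments, with "_" replaced by "<-".
r <- data.frame(c(1:4))
r <- cbind(r, c("apple", "apple", "bananna", "orange"))

# Stand-in for select.rows(r, r[,2] != "apple"): keep rows that are not "apple".
s <- r[r[, 2] != "apple", ]
```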

Andrew

On Wednesday 24 March 2004 06:33, Joseph LeBouton wrote:
> Hello all,
>
> HELP! I'm running S-plus 6.2, but the problem is the same in R 1.8.1.  I
> apparently misunderstand the select.rows command.  Can anybody help me with
> this?
>
> I ran into a problem trying to select rows (of a factor variable) from a
> dataframe and use the selected subset of factors to create summary
> statistics and particularly barcharts. I keep getting barchart labels (and
> summary stats) for factors I excluded using the select.rows
> command.  Values associated with the non-selected factor values are zeros,
> but I want these factor values to disappear altogether from both the
> summary stats and the barcharts!   In the full dataset I want to select 6-8
> species from a total of 39 species.  I've included some sample code that
> replicates my problem.
>
> thanks,
>
> jlb
>
>
> ########
> #EXAMPLE CODE THAT REPLICATES PROBLEM:
>
>
> #Problem with select.rows to exclude observations from a barchart:
>
> #Create dataframe with 4 rows, 2 columns:
> r_data.frame(c(1:4))
> r_cbind(r, c("apple","apple","bananna","orange"))
>
> #Select rows of dataframe 'r' to work with; exclude "apple" from dataframe.
> s_select.rows(r, r[,2]!="apple")
>
> #summarize 'r' and 's'; notice that "apple" still appears in summary of
> # 's', though it occurs zero times in the dataframe.
> summary(r)
> summary(s)
>
> #Do a bar chart of both dataframes.  Notice that "apple" appears in the
> # bar chart of both the 1st and 2nd dataframes, though all occurrences of
> # "apple" should have been excluded by the select.rows.
> barchart(r[,2]~r[,1])
> barchart(s[,2]~s[,1])
>
> ************************************************************************
> Joseph LeBouton
> PhD candidate
> Michigan State University
> Department of Forestry
> East Lansing, Michigan 48824
>
> (517) 355-7744
>
> lebouton@msu.edu
>
> --------------------------------------------------------------------
> This message was distributed by s-news@lists.biostat.wustl.edu.  To
> unsubscribe send e-mail to s-news-request@lists.biostat.wustl.edu with
> the BODY of the message:  unsubscribe s-news

-- 
Andrew Robinson                      Ph: 208 885 7115
Department of Forest Resources       Fa: 208 885 6226
University of Idaho                  E : andrewr@uidaho.edu
PO Box 441133                        W : http://www.uidaho.edu/~andrewr
Moscow ID 83843                      Or: http://www.biometrics.uidaho.edu
No statement above necessarily represents my employer's opinion.

