
Re: "by" question - summary of responses

To: <s-news@lists.biostat.wustl.edu>
Subject: Re: "by" question - summary of responses
From: "Kurbat, Matt BGI SF" <Matt.Kurbat@barclaysglobal.com>
Date: Mon, 17 Apr 2006 18:12:02 -0700
I've summarized responses to my original question (listed at bottom).  
Thanks to Matt Austin, Bill Dunlap, Richard Pugh, Rich Weber, and Kristina 
Kollen for their help.   

Matt



REVIEW AND SUMMARY
I wanted to compute various functions (min, max, mean, stdev, kurtosis, etc.) 
over groups defined in a "name" variable, for various variables in a dataframe, 
using the "by" function.   

My original posting is given below.  In brief I was trying to do something like:
        tempdata<-as.data.frame(matrix(NA,4,4))
        names(tempdata)<-c("yearmon","h0","h1","name")
        tempdata[1,]<-c(200602,1,4,"abvol")
        tempdata[2,]<-c(200603,2,3,"abvol")
        tempdata[3,]<-c(199106,3,2,"accr")
        tempdata[4,]<-c(199107,4,1,"accr")
        by(as.data.frame(tempdata[, 1]), tempdata$name, min)
My result was:
        Problem in NextMethod(.Generic): Can't find the generic function "FUN"
        Use traceback() to see the call stack
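
A likely root cause, sketched here in R syntax (an assumption: S-PLUS 7.0 may differ in detail): c() holds a single type, so mixing numbers with a string such as "abvol" turns every value into a character string. Each row assignment therefore stores text in every column, which would explain both the NA from "mean" and the failure of "min". The name col_classes is made up for this illustration.

```r
# c() returns a vector of one type; a single string makes the whole vector character.
row1 <- c(200602, 1, 4, "abvol")
class(row1)   # "character"

# Rebuild the example data frame exactly as in the posting.
tempdata <- as.data.frame(matrix(NA, 4, 4))
names(tempdata) <- c("yearmon", "h0", "h1", "name")
tempdata[1, ] <- c(200602, 1, 4, "abvol")
tempdata[2, ] <- c(200603, 2, 3, "abvol")
tempdata[3, ] <- c(199106, 3, 2, "accr")
tempdata[4, ] <- c(199107, 4, 1, "accr")

# Inspect the class of every column before applying numeric functions.
col_classes <- sapply(tempdata, class)
print(col_classes)
```

If the columns report a non-numeric class, they need converting before min, max, or mean can give sensible answers.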

Summary of suggestions:
** Use "function(x) min(x)" instead of just plain "min".
** Use "sapply(tempdata, class)" to check the column classes before applying 
functions.  Columns can be converted to numeric with, for example: 
        tempdata[,1]<-as.numeric(tempdata[,1])
** For this type of analysis, the "aggregate" function was recommended.  It 
works in a similar way to "by", but applies the supplied function to each 
column of the sub data frame.  
** If data are inappropriately non-numeric after being read in from an external 
file because a period marks missing values, pass the argument na.string="." to 
the importData() or read.table() call that reads the data.  The column should 
then be read in as numeric, with the period read as a missing value.
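
The first three suggestions can be combined roughly as follows. This is a sketch in R syntax, not tested on S-PLUS 7.0; the names num_cols, min_by_name, and min_agg are invented for the illustration. It uses as.numeric(as.character(...)) rather than plain as.numeric() because a column stored as a factor would otherwise convert to its internal level codes.

```r
# Example data with every column stored as text, as in the original posting.
tempdata <- data.frame(
  yearmon = c("200602", "200603", "199106", "199107"),
  h0      = c("1", "2", "3", "4"),
  h1      = c("4", "3", "2", "1"),
  name    = c("abvol", "abvol", "accr", "accr"),
  stringsAsFactors = FALSE
)

# Convert the columns that should be numeric; as.character() guards against factors.
num_cols <- c("yearmon", "h0", "h1")
for (col in num_cols) {
  tempdata[[col]] <- as.numeric(as.character(tempdata[[col]]))
}

# by() passes each sub data frame to the function, so wrap min in a function.
min_by_name <- by(tempdata[, "yearmon", drop = FALSE], tempdata$name,
                  function(x) min(x))
print(min_by_name)

# aggregate() applies the function to each column within each group.
min_agg <- aggregate(tempdata[, num_cols], by = list(name = tempdata$name),
                     FUN = min)
print(min_agg)
```

Here by() gives one value per group for the single column passed in, while aggregate() returns a data frame with one row per group and one column per variable, which is convenient when several variables are summarized at once.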

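The last suggestion, about periods marking missing values, might look like this in R. Two assumptions: the data in raw_text are made up for the illustration, and R's read.table() spells the argument na.strings, so the exact spelling for S-PLUS importData() should be checked against its help page.

```r
# Hypothetical file contents in which "." marks a missing value.
raw_text <- "yearmon h0 h1 name
200602 1 4 abvol
200603 . 3 abvol
199106 3 2 accr
199107 4 . accr"

# Tell read.table() to treat "." as missing so h0 and h1 stay numeric.
imported <- read.table(text = raw_text, header = TRUE, na.strings = ".",
                       stringsAsFactors = FALSE)

imported_classes <- sapply(imported, class)
print(imported_classes)
print(imported)
```

Without the na.strings setting, a column containing "." would be read as text rather than as numbers.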


>  -----Original Message-----
> From:         Kurbat, Matt BGI SF  
> Sent: Thursday, April 13, 2006 7:15 PM
> To:   's-news@lists.biostat.wustl.edu'
> Subject:      "by" question 
> 
> 
> 
> Hello, 
> 
> I've recently returned to Splus after a long layoff (forced SAS treatments).  
> I'm trying to apply "by" for various input functions and datasets and having 
> trouble getting it right.
> 
> For example, I can create a dataframe as follows: 
>       tempdata<-as.data.frame(matrix(NA,4,4))
>       names(tempdata)<-c("yearmon","h0","h1","name")
>       tempdata[1,]<-c(200602,1,4,"abvol")
>       tempdata[2,]<-c(200603,2,3,"abvol")
>       tempdata[3,]<-c(199106,3,2,"accr")
>       tempdata[4,]<-c(199107,4,1,"accr")
> With results that look like this:
>       > tempdata
>         yearmon h0 h1  name 
>       1  200602  1  4 abvol
>       2  200603  2  3 abvol
>       3  199106  3  2  accr
>       4  199107  4  1  accr
> 
> I want to compute various functions (min, max, mean, stdev, kurtosis, etc.)
> over the groups defined in the "name" variable using the "by" function.  
> On some data sets I can get "mean" to work fine but not the others.  
> 
> On the example above, I get the following results using "mean"
>       by(as.data.frame(tempdata[,1]), tempdata$name, mean)
>               > by(as.data.frame(tempdata[, 1]), tempdata$name, mean)
>               tempdata$name:abvol
>               [1] NA
>               -----------------------------------------------------------
>               tempdata$name:accr
>               [1] NA
> 
> On the example above, I get the following results using "min"
>       by(as.data.frame(tempdata[,1]), tempdata$name, min)
>               > by(as.data.frame(tempdata[, 1]), as.factor(tempdata$name), min)
>               Problem in NextMethod(.Generic): Can't find the generic function "FUN" 
>               Use traceback() to see the call stack
>  
> Would someone please explain to me how I can make these work?  
> I'm running Splus 7.0 on Windows 2000.  
> 
> Thanks!
> Matt
> 
> 
> 
> 
> 
 
