
by() function

To: s-news@lists.biostat.wustl.edu
Subject: by() function
From: "Gilliam, Reid" <rbgilliam@att.com>
Date: Thu, 31 Oct 2002 15:38:04 -0500
Thanks to Sundar Dorai-Raj for his recommendations!

--Reid


Try using this:

by(DF1,DF1[,colname],colSums)

This computes the sums of all columns within each group, indexing DF1
by the column name held in colname rather than by a pasted string. But
you mentioned you just wanted "Count". In that case, why not use tapply
or aggregate?

Regards,
Sundar
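
A minimal sketch of the tapply/aggregate suggestion, written for R (S-PLUS
behaviour may differ slightly). The sample values in DF1 and the names
group_cols, sums, and agg_V1 are made up for illustration; only the column
names "Count", "V1", "V2", and "V3" come from the question. It assumes
every column other than "Count" is a grouping column.

```r
# Hypothetical example data; only the column names follow the question.
DF1 <- data.frame(
  Count = c(1, 2, 3, 4, 5, 6),
  V1 = c("a", "a", "b", "b", "a", "b"),
  V2 = c("x", "y", "x", "y", "x", "y"),
  V3 = c("p", "p", "p", "q", "q", "q")
)

# Assumption: every column except "Count" is a grouping column.
group_cols <- setdiff(names(DF1), "Count")

# Index the data frame by name (DF1[, colname]) so the grouping values
# themselves are passed, not a character string such as "DF1$V1".
sums <- list()
for (colname in group_cols) {
  sums[[colname]] <- tapply(DF1$Count, DF1[, colname], sum)
}
print(sums)

# aggregate() returns the per-group totals as a data frame instead.
agg_V1 <- aggregate(DF1["Count"], by = DF1["V1"], FUN = sum)
print(agg_V1)
```

Collecting the results in a list keeps them available after the loop;
a bare call to by() or tapply() inside a for loop is not printed
automatically.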

"Gilliam, Reid" wrote:
> 
> Hi folks,
> 
> Suppose I have a data frame DF1 with columns "Count", "V1", "V2", and "V3".
> I want to compute the sum of "Count" for each of the unique values for "V1",
> "V2", and "V3".  I have decided to use the by() function to do this as
> follows:
> 
> by(DF1,DF1$"V1",colSums)
> by(DF1,DF1$"V2",colSums)
> by(DF1,DF1$"V3",colSums)
> 
> This works fine.  However, as I have some data frames with unknown column
> names, I would like to automate this process as follows:
> 
> for (i in 1:ncol(DF1)) {
> colname <- names(DF1[i])
> dfcolumn <- paste("DF1$",colname,sep="")
> by(DF1,dfcolumn,colSums)
> }
> 
> The problem with this code is that the by() function requires the second
> parameter to be the INDICES (i.e., in my case the column of the data frame
> DF1).  What I've passed to the by() function is a character string
> representation of these INDICES.   So, the by() function call ends up
> processing as: by(DF1,"DF1$V1",colSums) instead of by(DF1,DF1$V1,colSums).
> 
> Does anyone know how I could define "dfcolumn" in the for loop above so that
> the by() function will be correct?
> 
> Thanks,
> Reid Gilliam

-- 
Sundar Dorai-Raj
PDF Solutions, Inc.
Dallas TX
