Thanks to Sundar Dorai-Raj for his recommendations!
--Reid
Try using this:
by(DF1,DF1[,colname],colSums)
This computes the sums of all columns within each group. But you
mentioned you only want "Count"; in that case, why not use tapply or
aggregate?
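
A minimal sketch of both suggestions, written for R. The contents of DF1 are made up for illustration (the original post does not show the data), and treating every column other than "Count" as a grouping column is an assumption. The key point is that DF1[, colname] extracts the column itself, so there is no need to build the string "DF1$V1".

```r
# Hypothetical example data; not from the original post.
DF1 <- data.frame(Count = c(1, 2, 3, 4),
                  V1 = c("a", "a", "b", "b"),
                  V2 = c("x", "y", "x", "y"),
                  V3 = c("p", "p", "p", "q"))

# Assumption: every column except "Count" is a grouping column.
groupcols <- setdiff(names(DF1), "Count")

sums <- list()
for (colname in groupcols) {
  # Index by name to get the actual column, not a character string.
  groups <- DF1[, colname]

  # tapply: sum of "Count" for each unique value of the grouping column.
  sums[[colname]] <- tapply(DF1$Count, groups, sum)

  # Results are not auto-printed inside a for loop, so print explicitly.
  print(sums[[colname]])

  # The by() form, restricted to the numeric "Count" column so that
  # colSums is not handed the character columns of this example data.
  print(by(DF1["Count"], groups, colSums))
}
```

Collecting the results in a list (here the hypothetical name sums) keeps them available after the loop, e.g. sums$V1 holds the per-group totals for "V1".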
Regards,
Sundar
"Gilliam, Reid" wrote:
>
> Hi folks,
>
> Suppose I have a data frame DF1 with columns "Count", "V1", "V2", and
> "V3". I want to compute the sum of "Count" for each of the unique values
> for "V1", "V2", and "V3". I have decided to use the by() function to do
> this as follows:
>
> by(DF1,DF1$"V1",colSums)
> by(DF1,DF1$"V2",colSums)
> by(DF1,DF1$"V3",colSums)
>
> This works fine. However, as I have some data frames with unknown column
> names, I would like to automate this process as follows:
>
> for (i in 1:ncol(DF1)) {
> colname <- names(DF1[i])
> dfcolumn <- paste("DF1$",colname,sep="")
> by(DF1,dfcolumn,colSums)
> }
>
> The problem with this code is that the by() function requires the second
> parameter to be the INDICES (i.e., in my case the column of the data frame
> DF1). What I've passed to the by() function is a character string
> representation of these INDICES. So, the by() function call ends up
> processing as: by(DF1,"DF1$V1",colSums) instead of by(DF1,DF1$V1,colSums).
>
> Does anyone know how I could define "dfcolumn" in the for loop above so
> that the by() function will be correct?
>
> Thanks,
> Reid Gilliam
> --------------------------------------------------------------------
> This message was distributed by s-news@lists.biostat.wustl.edu. To
> unsubscribe send e-mail to s-news-request@lists.biostat.wustl.edu with
> the BODY of the message: unsubscribe s-news
--
Sundar Dorai-Raj
PDF Solutions, Inc.
Dallas TX