The following is an example I copied from my blog. Hope it helps.
CALCULATE GROUP SUMMARY IN R
##################################################
# HOW TO CALCULATE GROUP SUMMARY IN R #
##################################################
# EQUIVALENT SAS CODE: #
# #
# DATA DATA; #
# DO I = 1 TO 2; #
# DO J = 1 TO 4; #
# GROUP = 'TREATMENT_'||PUT(I, 1.); #
# X = RANNOR(1); #
# OUTPUT; #
# END; #
# END; #
# KEEP GROUP X; #
# RUN; #
# #
# PROC SQL; #
# CREATE TABLE COMBINE AS #
# SELECT *, MEAN(X) AS MEAN_X, SUM(X) AS SUM_X #
# FROM DATA #
# GROUP BY GROUP; #
# QUIT; #
##################################################
# GENERATE A TREATMENT GROUP (4 ROWS PER GROUP, AS IN THE SAS LOOP) #
group <- as.factor(paste("treatment", rep(1:2, each = 4), sep = "_"))
# CREATE A SERIES OF RANDOM VALUES #
x <- rnorm(length(group))
# CREATE A DATA FRAME TO COMBINE THE ABOVE TWO #
data <- data.frame(group, x)
# CALCULATE SUMMARY FOR X (ONE VALUE PER GROUP) #
x.mean <- tapply(data$x, data$group, mean, na.rm = TRUE)
x.sum <- tapply(data$x, data$group, sum, na.rm = TRUE)
# CREATE A DATA FRAME TO COMBINE SUMMARIES #
summ <- data.frame(x.mean, x.sum, group = names(x.mean))
# COMBINE DATA AND SUMMARIES TOGETHER #
combine <- merge(data, summ, by = "group")
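As a rough sketch of an alternative (not part of the blog post): in R, ave() returns a group statistic repeated for every row of its group, so the summaries can be attached to the data without a separate merge step. The seed and the column names mean_x and sum_x are my own choices for illustration, and I have only R in mind here, not S-Plus.

```r
# Same layout as the example: two treatment groups, four rows each
set.seed(1)  # assumed seed, only to make the run reproducible
group <- factor(paste("treatment", rep(1:2, each = 4), sep = "_"))
x <- rnorm(length(group))
data <- data.frame(group, x)

# ave() applies FUN within each group and returns a vector
# of the same length as x, aligned with the original rows
data$mean_x <- ave(data$x, data$group, FUN = function(v) mean(v, na.rm = TRUE))
data$sum_x  <- ave(data$x, data$group, FUN = function(v) sum(v, na.rm = TRUE))
```

The result should carry the same information as the merged data frame above, with the original row order kept.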
On 3/14/07, Neung-Hwan Oh <ultisol@gmail.com> wrote:
Thanks a lot for the quick replies.
But, doesn't "tapply" or "aggregate" provide a matrix format rather
than data frame format?
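On the matrix versus data frame question, here is a small sketch in R (S-Plus may behave differently, so treat this as an assumption to check there). It uses the site.no, date and value columns from the table quoted below; the time column is left out, and the names df, by_tapply, by_aggregate and n are made up for illustration. In R, tapply() returns an array while aggregate() returns a data frame.

```r
# Data typed in from the table in the original question (time omitted)
df <- data.frame(
  site.no = c(1, 2, 2, 3, 3, 3),
  date    = c("1989/04/27", "1975/10/01", "1975/10/01",
              "1993/04/10", "1993/04/10", "1993/04/10"),
  value   = c(1, 2, 4, 3, 6, 9)
)

# tapply() gives a named array, not a data frame
by_tapply <- tapply(df$value, df$site.no, mean)
is.data.frame(by_tapply)

# aggregate() gives a data frame; the summarised column is named "x"
groups <- list(site.no = df$site.no, date = df$date)
by_aggregate <- aggregate(df$value, by = groups, FUN = mean)
is.data.frame(by_aggregate)

# Count column: same grouping, with length() as the summary function
by_aggregate$n <- aggregate(df$value, by = groups, FUN = length)$x
```

Both aggregate() calls use the same grouping list, so their rows come back in the same order and the counts line up with the means.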
On 3/14/07, Rich@mango-solutions.com <Rich@mango-solutions.com> wrote:
> Sorry ...
>
> I mean "aggregate(df$value, df[,1:3], mean)"
>
> Rich.
> mangosolutions
>
> -----Original Message-----
> From: Rich@Mango-Solutions.com [mailto:Rich@Mango-Solutions.com]
> Sent: 14 March 2007 21:14
> To: 'Wensui Liu'; 'Neung-Hwan Oh'
> Cc: s-news@wubios.wustl.edu
> Subject: Re: [S] group by
>
> Yes ... you can use tapply in S. For the structure you're using, I'd
> probably recommend "aggregate" though ...
>
> Something like "aggregate(df[,1:3], df$value, mean)"
>
> Rich.
> mangosolutions
>
> -----Original Message-----
> From: Wensui Liu [mailto:liuwensui@gmail.com]
> Sent: 14 March 2007 21:02
> To: Neung-Hwan Oh
> Cc: s-news@wubios.wustl.edu
> Subject: Re: [S] group by
>
> I am not sure if in Splus, there is a nice function like tapply() in R or
> not.
>
> On 3/14/07, Neung-Hwan Oh <ultisol@gmail.com> wrote:
> > Hello,
> >
> > How can you calculate the following example in s-plus? In Access, it is
> > relatively easy with "Group By" and I am wondering whether there is a
> > similar function that I missed in S-Plus.
> >
> >
> >
> > "From this table"
> >
> > site.no  date        time   value
> > 1        1989/04/27  12:00  1.0
> > 2        1975/10/01  19:00  2.0
> > 2        1975/10/01  20:00  4.0
> > 3        1993/04/10  09:00  3.0
> > 3        1993/04/10  12:00  6.0
> > 3        1993/04/10  15:00  9.0
> >
> > "To this (averages per date per site)" + (count column?)
> >
> > site.no  date        time   value  (count)
> > 1        1989/04/27  12:00  1.0    (1.0)
> > 2        1975/10/01  19:30  3.0    (2.0)
> > 3        1993/04/10  12:00  6.0    (3.0)
> >
> >
> >
> > Many thanks!
> >
> > NH
>
>
> --
> WenSui Liu
> A lousy statistician who happens to know a little programming
> (http://spaces.msn.com/statcompute/blog)
> --------------------------------------------------------------------
> This message was distributed by s-news@lists.biostat.wustl.edu. To
> unsubscribe send e-mail to s-news-request@lists.biostat.wustl.edu with
> the BODY of the message: unsubscribe s-news
>
--
WenSui Liu
A lousy statistician who happens to know a little programming
(http://spaces.msn.com/statcompute/blog)