s-news
[Top] [All Lists]

aggregate() function

To: s-news@lists.biostat.wustl.edu
Subject: aggregate() function
From: Eric yang <yang_eric9@yahoo.com>
Date: Sun, 8 Apr 2007 10:56:41 -0700 (PDT)
Domainkey-signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; h=X-YMail-OSG:Received:Date:From:Subject:To:MIME-Version:Content-Type:Content-Transfer-Encoding:Message-ID; b=w5V5lLmPZObU6Rv0W7KaqCHI4eOV0Bct0nt3py9yVowIgX9XNH7LYbRdXWMjqFbOgyhvkfGoKvf+ZYT66Ea79GeYDxXDkFRCOVh419fyCoiYuYn8ECxs4gJSUO6B7XmUWSlnoms4twIPVqF13wRxV1bmC1RA5U6+j+Qb1H1HK/w=;
Dear members,

Suppose I have the following data:

my.data <- matrix(      
c("Group1", 24, 26, 100, 0.1, 0.15, 0.35, 0.4, 250,
"Group1", 35, 35, 102, 0.3, 0.4,  0.15, 0.15, 100,
"Group1", 33, 10, 88,  0.25, 0.25, 0.25, 0.25, 60,
"Group1", 80, 67, 10,  0.1, 0.1,  0.5,   0.4, 300,
"Group2", 10,  5,  6,  0.2, 0.3,  0.4,   0.1,  100,
"Group2", 5,  5,   5,  0.6, 0.15,  0.15, 0.15, 500,
"Group3", 78, 35, 34,  0.1, 0.25, 0.25, 0.4, 355,
"Group3", 67, 23, 234, 0.15, 0.15, 0.15, 0.55, 200),
ncol=9, byrow=T)
                                        
my.data <- as.data.frame(my.data)
names(my.data) <- c("GROUP", paste("A",1:3,sep=""),
paste("Percent", 1:4, sep=""), "my.weight")


I can use the aggregate function to calculate the mean
for each group over the various columns, i.e.

aggregate(my.data[, 2:ncol(my.data)],
list(Group=my.data[, 1]), mean)

I would like to calculate the mean for the columns
"A1"-"A3", a weighted mean for columns
"Percent1"-"Percent4" (weighted by my.weight) and the
sum for the column "my.weight".

What I would like to create is the following data:

Group   A1   A2    A3   Percent1   Percent2   Percent3
  Percent4   my.weight 
Group1 43.0 34.5  75.0  0.1408451  0.1725352 
0.3767606  0.3521127         710
Group2  7.5  5.0   5.5  0.5333333  0.1750000 
0.1916667  0.1416667         600
Group3 72.5 29.0 134.0  0.1180180  0.2139640 
0.2139640  0.4540541         555

i.e. for Group1, A1=(24+35+33+80)/4=43 (similarly for
A2 and A3)
Percent1<- c(0.1, 0.15, 0.35, 0.4)*250 + c(0.3, 0.4, 
0.15, 0.15)*100 +c( 0.25, 0.25, 0.25, 0.25)*60 +
c(0.1, 0.1,  0.5,   0.4)*300
Percent1 <- Percent1/710
my.weight = 250+100+60+300=710

Is there a neat way to create this type of data set in
SPlus using the aggregate function?

Any help is much appreciated.
Regards,
Eric





 
____________________________________________________________________________________
TV dinner still cooling? 
Check out "Tonight's Picks" on Yahoo! TV.
http://tv.yahoo.com/

<Prev in Thread] Current Thread [Next in Thread>
  • aggregate() function, Eric yang <=