s-news

Re: combine duplicated rows in a big data set

To: "'haibo tang'" <haibotang@yahoo.com>, <s-news@lists.biostat.wustl.edu>
Subject: Re: combine duplicated rows in a big data set
From: "Nick Ellis" <nick.ellis@marine.csiro.au>
Date: Mon, 9 Jul 2001 09:54:45 +1000
Importance: Normal
In-reply-to: <20010707024213.82753.qmail@web13304.mail.yahoo.com>
Reply-to: <nick.ellis@marine.csiro.au>
See attachments.

Nick Ellis
CSIRO Marine Research   mailto:Nick.Ellis@marine.csiro.au
PO Box 120                      ph    +61 (07) 3826 7260
Cleveland QLD 4163      fax   +61 (07) 3826 7222
Australia                       http://www.marine.csiro.au



> -----Original Message-----
> From: s-news-owner@lists.biostat.wustl.edu
> [mailto:s-news-owner@lists.biostat.wustl.edu]On Behalf Of haibo tang
> Sent: Saturday, 7 July 2001 12:42
> To: s-news@lists.biostat.wustl.edu
> Subject: [S] combine duplicated rows in a big data set
> 
> 
> Dear Splus Users,
> 
> When handling a big data set (200K * 26), I came
> across this problem. The data set (M1) looks like
> this:
> > M1
>   V1 V2 V3 
> 1  A  X  4
> 2  B  Y  1
> 3  A  Y  2
> 4  A  X  3
> 5  B  Y  5
> 6  B  Z  6
> 7  C  Z  3
> 
> What I want is M2
> > M2
>   V1 V2 V3 
> 1  A  X  7
> 2  B  Y  6
> 3  A  Y  2
> 4  B  Z  6
> 5  C  Z  3
> 
> that is, I need to add up the V3's for rows with the
> same V1 and V2, for example, rows 1 and 4, and rows 2
> and 5 in M1. 
> 
> Could anyone tell me how to do it efficiently?
> 
> Thank you.
> 
> Haibo  
> 
> 
> ---------------------------------------------------------------------
> This message was distributed by s-news@lists.biostat.wustl.edu.  To
> unsubscribe send e-mail to s-news-request@lists.biostat.wustl.edu with
> the BODY of the message:  unsubscribe s-news
> 
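
A minimal sketch of the grouped sum asked for in the quoted question,
written in R rather than S-PLUS (an assumption: rowsum() and duplicated()
may behave differently or be missing in a given S-PLUS release). It is not
the solution from the attachments. Row 2 of M1 is taken as B Y 1 so that
rows 2 and 5 share a key, as the question describes, and the "|" separator
is assumed not to occur in V1 or V2.

```r
# M1 from the question; row 2 is taken as B/Y (an assumption, see above).
M1 <- data.frame(
  V1 = c("A", "B", "A", "A", "B", "B", "C"),
  V2 = c("X", "Y", "Y", "X", "Y", "Z", "Z"),
  V3 = c(4, 1, 2, 3, 5, 6, 3),
  stringsAsFactors = FALSE
)

# One key per row, built from the two grouping columns.
# Assumes "|" never appears in V1 or V2.
key <- paste(M1$V1, M1$V2, sep = "|")

# Sum V3 within each key; reorder = FALSE keeps first-appearance order.
sums <- rowsum(M1$V3, key, reorder = FALSE)

# Keep the first row of each key and attach the group totals.
M2 <- M1[!duplicated(key), c("V1", "V2")]
M2$V3 <- as.vector(sums)
row.names(M2) <- NULL
M2
```

Both duplicated() and rowsum(reorder = FALSE) follow the order in which
keys first appear, so the totals line up with the retained rows without a
sort or merge step.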
--- Begin Message ---
To: <s-news@wubios.wustl.edu>
Subject: [S] Re:Summary: Alternatives to aggregate
From: "Gérald Jean" <Gerald.Jean@spgdag.ca>
Date: Wed, 8 Dec 1999 01:23:27 +1000
Importance: Normal
Sorry, I forgot to include another function from Bill that is used by both
"agg.sum" and "agg.length".  Here it is.


"unname" <- function(x)
{
# Author : Bill Dunlap, from MathSoft.
# Date   : ???
# Purpose: removes the names from its argument "x".

 names(x) <- NULL
 x
}
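
A small usage sketch of the helper above, run in R (an assumption; the
original targets S-PLUS). The vector "v" is hypothetical. Note that current
R already ships a base function called unname(), so defining this one masks
it in the workspace.

```r
# Same definition as in the message, repeated so the sketch runs on its own.
unname <- function(x)
{
  names(x) <- NULL
  x
}

# Hypothetical named vector, e.g. per-group totals.
v <- c(A = 7, B = 6)

stripped <- unname(v)
names(v)         # the original still carries its names
names(stripped)  # the copy has none
```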

Gérald Jean
Consulting Analyst (Statistics), Actuarial
telephone            : (418) 835-8839
fax                  : (418) 835-5865
e-mail               : gerald.jean@spgdag.ca

"In God we trust all others must bring data"



--- End Message ---
--- Begin Message ---
To: <s-news@wubios.wustl.edu>
Subject: [S] Summary: Alternatives to aggregate?
From: "Gérald Jean" <Gerald.Jean@spgdag.ca>
Date: Wed, 8 Dec 1999 01:17:23 +1000
Importance: Normal
Hi everyone,

Yesterday I posted a question to the list asking for an alternative to the
built-in function "aggregate".  I got a single answer, from Bill Dunlap of
MathSoft; his solution is amazingly fast for large data sets.
The original posting and Bill's solution follow.

Thanks Bill,

Gérald Jean
Consulting Analyst (Statistics), Actuarial
telephone            : (418) 835-8839
fax                  : (418) 835-5865
e-mail               : gerald.jean@spgdag.ca

"In God we trust all others must bring data"

---------------

Attachment: ATT00434.txt
Description: Text document


--- End Message ---