On Mon, 21 Jul 2003 12:40:38 -0700
Wim Kimmerer <kimmerer@sfsu.edu> wrote:
> Splusers (v. 61.1 windows 98 PIII with 256MB):
>
> I have a data frame with about 23000 records and 7 columns. I want to get
> sums of the last 2 columns by unique combinations of the first 5 columns,
> which results in about 18000 records.
>
> So... I used aggregate, and when I couldn't get any work done for several
> hours because the computer was humming and grinding away at this problem, I
> nuked Splus, exported the data, imported into Access (YUK) and ran a query
> which took.... well I don't know but it was less than a second.
>
> I looked at Hmisc for alternatives: there is a function called summarize,
> but that only works for functions that return >1 value (as far as I know),
> and will not run the function on more than one data value (you can give it
> a matrix but for each combination of the grouping variables, it performs
> the function on all of the values in all columns of the matrix
> corresponding to those rows). Thus, summarize is not suitable for what I
> want to do, without needless trickery.

Wim - This posting to s-news was premature as I just finished sending you a
private reply to the private message you also sent. summarize works fine with
only one statistic, but you may need to update your version of summarize and a
function it calls, mApply. See the note I just sent. -FH

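For illustration only: the operation described above (sum the last 2 columns
within each unique combination of the first 5) can be done in one pass over
the rows with a hash lookup per row, which is roughly why a database query
finishes so quickly. The sketch below is Python, not S-PLUS, and makes no
claim about how aggregate, summarize, or mApply work internally. The function
name group_sums, the n_keys parameter, and the sample rows are all
hypothetical.

```python
def group_sums(rows, n_keys=5):
    """Sum the trailing columns of each row, grouped by the first n_keys columns."""
    totals = {}
    for row in rows:
        key = tuple(row[:n_keys])      # unique combination of grouping columns
        values = row[n_keys:]          # columns to be summed
        acc = totals.get(key)
        if acc is None:
            totals[key] = list(values)  # first row seen for this group
        else:
            for i, v in enumerate(values):
                acc[i] += v             # add into the running sums
    return totals


# Made-up rows: 5 grouping columns followed by 2 numeric columns.
rows = [
    ("A", 1, "x", 10, "s1", 2.0, 5),
    ("A", 1, "x", 10, "s1", 3.0, 1),
    ("B", 2, "y", 20, "s2", 1.5, 4),
]
result = group_sums(rows)
```

Each row is touched once, so the cost grows with the number of rows rather
than with the number of groups times the number of rows.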
>
> I realize that aggregate uses tapply which uses loops, but geez.... several
> hours at least, compared to under one second?
>
> Is there an alternative that does not loop?
>
> Thanks...Wim
> ======================
> Dr. Wim Kimmerer
> Romberg Tiburon Center
> San Francisco State University
> 3152 Paradise Drive
> Tiburon CA 94920
> Ph. (415) 338-3515
> Fax (415) 435-7120
> http://online.sfsu.edu/~kimmerer/
---
Frank E Harrell Jr Prof. of Biostatistics & Statistics
Div. of Biostatistics & Epidem. Dept. of Health Evaluation Sciences
U. Virginia School of Medicine http://hesweb1.med.virginia.edu/biostat