Re: scalability

To: s-news@lists.biostat.wustl.edu
Subject: Re: scalability
From: Alan Zaslavsky <zaslavsk@hcp.med.harvard.edu>
Date: Mon, 29 Mar 2004 07:36:59 -0500 (EST)
> The problem up for comment is:
> 
> result <- apply(array.3D, 1:2, sum)
> 
> Where array.3D is 3000 by 300 by 3.

For this problem I wrote a little function many years ago called "dimsum"
(which you can think of as summing over arbitrary dimensions, or as a
Chinese breakfast, as you prefer).  "keep" gives the dimensions to keep,
specified in the usual way for subscripts: positive values to keep,
negative values to name the dimensions to drop, or a logical vector.  The
remaining dimensions are summed out, so for the problem above the call
would be dimsum(array.3D, 1:2).  (At the time I used my own "rowsum()"
function, which is simpler than the one in recent versions of S-Plus; you
could also make this more efficient by directly incorporating the C call to
"S_rowsum" that appears in rowsum().)  This is entirely vectorized, although
presumably aperm could get slow with a very big array.
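
As a small illustration of the subscript convention for "keep", the sketch
below shows how the three forms resolve to the same dimension numbers for a
3-way array.  The variable names are made up for this example and are not
part of dimsum itself.

```r
# Illustrative only: three ways of asking for dimensions 1 and 2 of a 3-way array.
ndims <- 3

keep_pos <- (1:ndims)[c(1, 2)]               # positive: name the dimensions to keep
keep_neg <- (1:ndims)[-3]                    # negative: name the dimension to drop
keep_lgl <- (1:ndims)[c(TRUE, TRUE, FALSE)]  # logical: TRUE marks a kept dimension

# Whatever is not kept gets summed out.
summed <- (1:ndims)[-keep_pos]

print(keep_pos)
print(keep_neg)
print(keep_lgl)
print(summed)
```

All three forms select the same dimensions, which is why the function can
normalize "keep" with a single subscripting step.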


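dimsum <-
function(data, keep)
{
    dims <- dim(data)
    ndims <- length(dims)
    # resolve a positive, negative or logical "keep" to dimension numbers
    keep <- (1:ndims)[keep]
    # move the summed-out dimensions to the front, then flatten so that
    # each column is one cell of the kept dimensions
    mat <- matrix(aperm(data, c((1:ndims)[ - keep], keep)),
        ncol = prod(dims[keep]))
    # one group for all rows, so rowsum() returns the column totals
    ar <- array(rowsum(mat, rep(1, nrow(mat))), dims[keep])
    dimnames(ar) <- dimnames(data)[keep]
    ar
}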
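
A quick way to convince yourself that the function agrees with apply() is
to compare the two on a small array.  The sketch below assumes R's rowsum()
rather than my own or the S-Plus one, repeats the function so that it runs
on its own, and uses a made-up 4 by 3 by 2 array called "small" in place of
the 3000 by 300 by 3 one.

```r
# Copy of dimsum() from the message, so this block runs by itself.
dimsum <- function(data, keep) {
    dims <- dim(data)
    ndims <- length(dims)
    keep <- (1:ndims)[keep]
    mat <- matrix(aperm(data, c((1:ndims)[-keep], keep)),
                  ncol = prod(dims[keep]))
    ar <- array(rowsum(mat, rep(1, nrow(mat))), dims[keep])
    dimnames(ar) <- dimnames(data)[keep]
    ar
}

# Hypothetical test data, much smaller than the array in the question.
set.seed(1)
small <- array(rnorm(4 * 3 * 2), c(4, 3, 2))

by_apply  <- apply(small, 1:2, sum)               # the original approach
by_dimsum <- dimsum(small, 1:2)                   # keep dimensions 1 and 2
by_drop   <- dimsum(small, -3)                    # same thing, by dropping 3
by_lgl    <- dimsum(small, c(TRUE, TRUE, FALSE))  # same thing, logical form

# Keeping a single dimension gives a one-dimensional array, so compare
# it with apply() as a plain vector.
mid_apply  <- apply(small, 2, sum)
mid_dimsum <- as.vector(dimsum(small, 2))

print(all.equal(by_apply, by_dimsum))
print(all.equal(by_dimsum, by_drop))
print(all.equal(by_dimsum, by_lgl))
print(all.equal(mid_apply, mid_dimsum))
```

The comparisons use all.equal() rather than identical() because the two
approaches may add the numbers in a different order.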


