There are almost always pros and cons with these issues. S's sum() is an
S4 generic whereas R's is internal *unless* you define an S4 method for
it (which S-PLUS has already done). S needs to create several frames for
what is a nested set of function calls -- 1280 bytes looks modest for that.
Also, S has an ability to back out calculations that R does not, and that
costs memory (and can have benefits).
We know there are overheads in making functions generic, especially
S4-generic, but then there are benefits too. I am not sure designers who
add features take enough account of the costs.
Let us remind you that you appear not to be sufficiently conversant with
user-level issues to know to use rowSums: `scalability' is mainly a
function of user expertise, but prodigious amounts of RAM can help.
On Fri, 26 Mar 2004, Steve Karmesin wrote:
> Steve Karmesin wrote:
>
> > As others have said, what apply has to do in this case is loop over
> > the 900,000 cases and do a 'sum' over three elements each time. In
> > this case the overhead of calling an S+ function totally swamps the
> > numeric operations.
>
> A little more investigation (together with office mate Tony Plate)
> provides some insight.
>
> Using mem.tally.reset() and mem.tally.report() shows that for this case
> it is allocating a whopping 1280 bytes for each call to 'sum'.
>
> Just touching that much memory is going to be slow. So why would it do
> that? Looking at the definition of the apply function shows that it is
> allocating a general list for the result, not a vector-based array or
> matrix.
>
> Why? It has a shortcut that lets it use efficient matrices if the input
> is a 2D matrix, but this one is 3D, so it uses the general code, which
> is much, much slower and uses a lot more memory.
>
> If you collapse the first two dimensions of the array the times are
> stable at <80usec per call to sum and it allocates 8 bytes per call,
> which is just the amount of space needed.
>
> Still, the R code seems to always build a list, and it is about 15usec
> per call. Somehow the underlying function call and perhaps list storage
> mechanisms are more efficient there.
>
> For comparison, using rowSums has the same 8 bytes per call required
> to store the result and about 0.1 usec per call, since the whole
> evaluation is then in C and the S-plus function 'sum' is never actually
> called.
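
The three approaches compared above can be sketched roughly as follows. This
is R rather than S-PLUS, and the array `a` and its dimensions are assumptions:
the thread gives only the 900,000 cases of three elements, not the actual
shape or data. Timings and memory use will differ between systems, so none
are shown.

```r
# Hypothetical data: a 300 x 3000 x 3 array, i.e. 900,000 cases of 3 values.
# The real dimensions are not stated in the thread.
set.seed(1)
a <- array(runif(300 * 3000 * 3), dim = c(300, 3000, 3))

# 1. apply() over the first two dimensions: one call to sum() per case.
via_apply <- apply(a, c(1, 2), sum)

# 2. Collapse the first two dimensions, then apply() over the rows
#    of the resulting 900,000 x 3 matrix.
m <- a
dim(m) <- c(300 * 3000, 3)
via_collapsed <- apply(m, 1, sum)

# 3. rowSums() with dims = 2 sums over the third dimension in compiled
#    code, so sum() is never called at the interpreter level.
via_rowsums <- rowSums(a, dims = 2)

# All three give the same numbers; only the cost of getting them differs.
stopifnot(isTRUE(all.equal(via_apply, via_rowsums)))
stopifnot(isTRUE(all.equal(as.vector(via_rowsums), via_collapsed)))
```

Wrapping each of the three calls in system.time() is one way to see the
relative cost on a given machine.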
--
Brian D. Ripley, ripley@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595