After experimenting with it for a while, I think the performance problem in 5.1
comes from passing a function definition in the tapply call instead of the name
of an existing function. Here are the two tests that I ran:
SPLUS 5.1
============
> version
Version 5.1 Release 1 for Sun SPARC, SunOS 5.5 : 1999
> unix.time(a.2 <- tapply(x.1, x.2, sum))
[1] 1.31 0.02 2.00 0.00 0.00
#
# now make an 'indirect' call to the 'sum' function
#
> unix.time(a.2 <- tapply(x.1, x.2, function(x)sum(x)))
[1] 6.10 0.03 7.00 0.00 0.00
These should give the same result, but the second one takes more than 4 times
as long (6.10 vs. 1.31 seconds of user CPU time). Here is the same run with 3.4:
SPLUS 3.4
===========
> version
Version 3.4 Release 1 for Sun SPARC, SunOS 5.3 : 1996
> unix.time(a.2 <- tapply(x.1, x.2, sum))
[1] 0.45 0.02 0.00 0.00 0.00
#
# now make an 'indirect' call to the 'sum' function
#
> unix.time(a.2 <- tapply(x.1, x.2, function(x)sum(x)))
[1] 0.6 0.0 0.0 0.0 0.0
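These are much closer in time (a factor of about 1.3, nowhere near 4). SPLUS 3.4
is also about 3 times faster than 5.1 on the direct 'tapply' call (0.45 vs. 1.31
seconds) and about 10 times faster on the indirect one (0.6 vs. 6.10 seconds).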
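
For anyone who wants to repeat the comparison, here is a minimal sketch. The
data are made up, since x.1 and x.2 from my session are not shown above; the
size n and the 1000 groups are assumptions, as are the names t.direct,
t.indirect, a.direct, a.indirect and same. The sketch uses system.time so that
it runs under R; under S-PLUS, the unix.time call from the transcripts above
would take its place.

```r
# Hypothetical data: the original x.1 and x.2 are not shown in the post.
set.seed(1)
n <- 100000
x.1 <- runif(n)                               # values to be summed
x.2 <- sample(1:1000, n, replace = TRUE)      # grouping variable

# Direct call: pass the function itself.
t.direct <- system.time(a.direct <- tapply(x.1, x.2, sum))

# 'Indirect' call: wrap 'sum' in a function definition.
t.indirect <- system.time(a.indirect <- tapply(x.1, x.2, function(x) sum(x)))

# Both forms should produce the same group sums.
same <- all.equal(a.direct, a.indirect)
print(same)

# First three timing elements (user, system, elapsed) side by side.
print(rbind(direct = t.direct[1:3], indirect = t.indirect[1:3]))
```

The timings will depend on the machine and the version, so only the ratio
between the two rows is worth comparing.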