Hello Bill,
thanks for looking into that problem. You said you changed the Splus code,
in "summary.glm" I assume, to "sqrt(x)" rather than "x^0.5". Unfortunately
it doesn't solve the problem, takes just as long to run on the Linux 64-bit
version of Splus!
> ttt.test <- sample(c(0.99999999999999978, 0.99999999999999978,
+ 1.0000000000000002, 1.0000000000000002,
+ 1.0000000000000002, 1.0000000000000002,
+ 0.99999999999999978, 1.0000000000000002,
+ 0.99999999999999978, 0.99999999999999978), size =
826015,
+ replace = T)
> class(ttt.test)
[1] "numeric"
> mysummary(ttt.test)
Min. 1st Qu. Median Mean 3rd Qu. Max. N Sum
1 1 1 1 1 1 826015 826015
> resources(ttt.test.sqrt <- sqrt(ttt.test))
User time = 0 h. 12 min. 39.33 s.
System time = 0 h. 0 min. 0.06 s.
CPU time = 0 h. 12 min. 39.39 s.
Elapsed time = 0 h. 12 min. 40.24 s.
Child = 0 h. 0 min. 0.00 s.
% CPU = 99.89
Memory usage:
Cache = 0 Bytes
Working = 13.217072M Bytes
I find this annoying since I extensively use summary.glm; I suppose I'll
make a private copy and check for closeness to "1" before taking sqrt.
Gérald Jean
Conseiller senior en statistiques, Actuariat
télephone : (418) 835-4900 poste (7639)
télecopieur : (418) 835-6657
courrier électronique: gerald.jean@dgag.ca
"In God we trust, all others must bring data" W. Edwards Deming
Bill Dunlap
<bill@insightful.
com> A
Robert A LaBudde <ral@lcfltd.com>
2008/04/07 20:09 cc
gerald.jean@dgag.ca,
s-news@wubios.wustl.edu,
support@insightful.com
Objet
Re: [S] Square root of a vector
taking for ever!
On Mon, 7 Apr 2008, Robert A LaBudde wrote:
> I'd say there is a problem with an iterative
> algorithm for a fractional power of a number near
> one. I ran your problem in R v.2.6.2 and got:
> ...
> > system.time(ttt.test.sqrt <- ttt.test^0.5)
> user system elapsed
> 0.06 0.01 0.07
>
> I suppose Insightful should look into their algorithm, if it takes 13
minutes.
>
> I would recommend you use "sqrt()" instead of
> "^0.5" as a workaround. It's usually better to do this anyway.
Splus uses that standard math library's pow() function. We do not
write our own code for things like that, although we do have to write
wrappers to work around known problems in the math library's functions
(e.g., they do know about our missing value flags and don't always do
the right things for Inf).
The root cause is that the 64-bit gcc 3.4.5 version of pow(x,y) is
extremely slow when x is 1 bit away from 1 in either direction (1+2^-52
or 1-2^-53) and y is quite close or equal to 0.5. Our tests did not
check those cases.
I just changed the C code in Splus to use sqrt(x) instead of
pow(x,0.5). Later I looked at R's C code (src/main/arithmetic.c) and
noticed that it does the same thing unless one is using gcc 4.3:
/* work around a bug in May 2007 snapshots of gcc pre-4.3.0 */
#if __GNUC__ == 4 && __GNUC_MINOR__ == 3
return (y == 2.0) ? x*x : pow(x, y);
#else
return (y == 2.0) ? x*x : ((y == 0.5) ? sqrt(x) : pow(x, y));
#endif
I don't know if it does this to avoid a speed problem or for some other
reason. The checkin comment says
"work around an optimization bug in currnt gcc pre-4.3.0"
You can still detect the problem in R (compiled with gcc 3.4.5
in 64-bit mode on Linux) if you use a power quite close to 0.5:
R64> unix.time(rep(1+2^-52,1000) ^ 0.5) # uses C sqrt()
user system elapsed
0 0 0
R64> unix.time(rep(1+2^-52,1000) ^ (0.5+2^-46)) # uses C pow()
user system elapsed
2.177 0.000 2.177
> >Allmost 13 minutes to take 826015 square roots of numbers near 1?
----------------------------------------------------------------------------
Bill Dunlap
Insightful Corporation
bill at insightful dot com
360-428-8146
"All statements in this message represent the opinions of the author and
do
not necessarily reflect Insightful Corporation policy or position."
Le message ci-dessus, ainsi que les documents l'accompagnant, sont destinés
uniquement aux personnes identifiées et peuvent contenir des informations
privilégiées, confidentielles ou ne pouvant être divulguées. Si vous avez
reçu ce message par erreur, veuillez le détruire.
This communication (and/or the attachments) is intended for named
recipients only and may contain privileged or confidential information
which is not to be disclosed. If you received this communication by mistake
please destroy all copies.
|