
Re: Square root of a vector taking for ever!

To: gerald.jean@dgag.ca
Subject: Re: Square root of a vector taking for ever!
From: Robert A LaBudde <ral@lcfltd.com>
Date: Mon, 07 Apr 2008 20:40:56 -0400
Cc: s-news@wubios.wustl.edu,support@insightful.com
I'd say there is a problem with an iterative algorithm for a fractional power of a number near one. I ran your problem in R v.2.6.2 and got:

> ttt.test <- sample(c(0.99999999999999978, 0.99999999999999978,
+                      1.0000000000000002, 1.0000000000000002,
+                      1.0000000000000002, 1.0000000000000002,
+                      0.99999999999999978, 1.0000000000000002,
+                      0.99999999999999978, 0.99999999999999978),
+                    size = 826015, replace = T)
> class(ttt.test)
[1] "numeric"
> length(ttt.test)
[1] 826015
> system.time(ttt.test.sqrt <- ttt.test^0.5)
   user  system elapsed
   0.06    0.01    0.07
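To make "near one" concrete, here is a small sketch in R (my own illustration, not part of the original session; the name offset.in.eps is made up) that expresses the distance of the two test values from 1 in units of machine epsilon, assuming IEEE double precision:

```r
# The two distinct values that make up the test vector above.
vals <- c(0.99999999999999978, 1.0000000000000002)

# Distance from 1 in units of machine epsilon
# (.Machine$double.eps is 2^-52 for IEEE doubles).
offset.in.eps <- (vals - 1) / .Machine$double.eps

print(offset.in.eps)
```

Both values should come out about one epsilon away from 1, one below and one above, so they are about as close to 1 as a double can get without being exactly 1.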

I suppose Insightful should look into their algorithm, if it takes 13 minutes.

I would recommend you use "sqrt()" instead of "^0.5" as a workaround. It's usually better to do this anyway.
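A minimal sketch of that workaround in R, for comparison. I have not run it in S-PLUS, the seed is arbitrary, and time.pow, time.sqrt, res.pow and res.sqrt are names I made up for the illustration:

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Same kind of test vector: values one step below and above 1.
ttt.test <- sample(c(0.99999999999999978, 1.0000000000000002),
                   size = 826015, replace = TRUE)

# Time the power operator and the dedicated square-root function.
time.pow  <- system.time(res.pow  <- ttt.test^0.5)
time.sqrt <- system.time(res.sqrt <- sqrt(ttt.test))

print(time.pow)
print(time.sqrt)

# Check that both approaches agree to within numerical tolerance.
print(all.equal(res.pow, res.sqrt))
```

If the two timings differ wildly on a given platform, that points at the power routine rather than at the data, and sqrt() is the safer call to use in "summary.glm".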


At 01:12 PM 4/7/2008, gerald.jean@dgag.ca wrote:
Hello there,

I am running :

S-PLUS : Copyright (c) 1988, 2007 Insightful Corp.
S : Copyright Insightful Corp.
Version 8.0.4  for Linux 2.4.21-37.EL, 64-bit : 2007

Running "summary.glm" twice on the output of "dglm" (a function by Gordon
Smyth for dispersion modeling) takes roughly 20 minutes. I made a private
version of "summary.glm" to track down where the bottleneck is.
Here is what I found:

The instruction "wt <- wt^0.5" is the one that's eating up all the time.
I then inserted a "return(wt)" statement in the function and played around
with this "named" vector. I "unnamed" the vector but the timing was
basically the same. I tracked the problem down to taking the square root of
numbers very close to "1". To make a long story short, I'll show you the
same command run on the above 64-bit Linux machine and on a
32-bit Windows version:

--------------------------------------------------------------------------------
ttt.test <- sample(c(0.99999999999999978, 0.99999999999999978,
                     1.0000000000000002, 1.0000000000000002,
                     1.0000000000000002, 1.0000000000000002,
                     0.99999999999999978, 1.0000000000000002,
                     0.99999999999999978, 0.99999999999999978),
                   size = 826015, replace = T)
class(ttt.test)
[1] "numeric"
mysummary(ttt.test)
                   Min.                1st Qu.                 Median
 9.9999999999999985e-03 9.9999999999999985e-03 9.9999999999999985e-03
                   Mean                3rd Qu.                   Max.
 1.0000000000000000e-02 1.0000000000000002e-02 1.0000000000000002e-02
                      N                    Sum
 8.2601500000000000e+05 8.2601500000000000e+05
length(ttt.test)
[1] 826015
resources(ttt.test.sqrt <- ttt.test^0.5)
 User time    =  0 h.  12 min.  39.060000000000172804 s.
 System time  =  0 h.   0 min.   0.019999999999999574 s.
 CPU time     =  0 h.  12 min.  39.080000000000154614 s.
 Elapsed time =  0 h.  12 min.  39.860000000000582077 s.
 Child        =  0 h.   0 min.   0.000000000000000000 s.
 % CPU        =  99.9
 Memory usage:
      Cache   = 0 Bytes
      Working = 13.217104M Bytes
--------------------------------------------------------------------------------
Almost 13 minutes to take 826015 square roots of numbers near 1?
Now on the 32-bit Windows machine:

S-PLUS : Copyright (c) 1988, 2007 Insightful Corp.
S : Copyright Insightful Corp.
Enterprise Developer Version 8.0.4  for Microsoft Windows : 2007

ttt.test <- sample(c(0.99999999999999978, 0.99999999999999978,
                     1.0000000000000002, 1.0000000000000002,
                     1.0000000000000002, 1.0000000000000002,
                     0.99999999999999978, 1.0000000000000002,
                     0.99999999999999978, 0.99999999999999978),
                   size = 826015, replace = T)
class(ttt.test)
[1] "numeric"
options(digits = 17)
mysummary(ttt.test)
                Min.             1st Qu.             Median Mean            3rd Qu.
 0.99999999999999978 0.99999999999999978 1.0000000000000002    1 1.0000000000000002
               Max.      N                     Sum
 1.0000000000000002 826015  8.2601500000000000e+05
length(ttt.test)
[1] 826015
resources(ttt.test.sqrt <- ttt.test^0.5)
 User time    =  0 h.  0 min.  0.260999999999999900 s.
 System time  =  0 h.  0 min.  0.010000000000000009 s.
 CPU time     =  0 h.  0 min.  0.270999999999999910 s.
 Elapsed time =  0 h.  0 min.  0.271000000000015010 s.
 Child        =  0 h.  0 min.  0.000000000000000000 s.
 % CPU        =  100
 Memory usage:
      Cache   = 0 Bytes
      Working = 13.232343M Bytes
--------------------------------------------------------------------------------
It is only

(12 * 60 + 39.860000000000582077) / 0.271000000000015010
[1] 2803.9114391142380

times faster??? I know that a problem that fits in 32-bit will run faster
in 32-bit than in 64-bit, BUT I doubt very much that it explains such a
difference. Am I missing something???

Thanks for any support,

Gérald Jean
Senior Statistical Advisor, Actuarial
telephone : (418) 835-4900 ext. (7639)
fax       : (418) 835-6657
e-mail    : gerald.jean@dgag.ca

"In God we trust, all others must bring data"  W. Edwards Deming



--------------------------------------------------------------------
This message was distributed by s-news@lists.biostat.wustl.edu.  To
unsubscribe send e-mail to s-news-request@lists.biostat.wustl.edu with
the BODY of the message:  unsubscribe s-news

================================================================
Robert A. LaBudde, PhD, PAS, Dpl. ACAFS  e-mail: ral@lcfltd.com
Least Cost Formulations, Ltd.            URL: http://lcfltd.com/
824 Timberlake Drive                     Tel: 757-467-0954
Virginia Beach, VA 23464-3239            Fax: 757-467-2947

"Vere scire est per causas scire"
================================================================

