s-news
[Top] [All Lists]

Re: speed up for() loop in function

To: Jean V Adams <jvadams@usgs.gov>
Subject: Re: speed up for() loop in function
From: Sundar Dorai-Raj <sundar.dorai-raj@PDF.COM>
Date: Sat, 15 May 2004 08:10:06 -0700
Cc: s-news@wubios.wustl.edu
In-reply-to: <OF8166BACD.6E1C4081-ON85256E95.004E4BB4@er.usgs.gov>
Organization: PDF Solutions, Inc.
References: <OF8166BACD.6E1C4081-ON85256E95.004E4BB4@er.usgs.gov>
Reply-to: sundar.dorai-raj@PDF.COM
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 Netscape/7.1 (ax)


Jean V Adams wrote:
I'm writing a function to carry out a simulation, and I'd like to try and
speed it up.  Most parts of the function run fast (random number
generation, arithmetic operations, all array operations, no loops) until I
get near the end, where I conduct about 100,000 t-tests inside of a for()
loop.  Here's a simplified example with two three-dimensional arrays
(4x3x2) and one t-test in a for() loop.

# data
dims <- c(4, 3, 2)
a <- array(rnorm(prod(dims)), dim=dims)
b <- array(rnorm(prod(dims)), dim=dims)

# for loop
m <- array(NA, dim=dims[1:2])
for(i in 1:dims[1]) {
for(j in 1:dims[2]) {
      m[i, j] <- t.test(a[i, j, ], b[i, j, ])$p.value<0.05
      }}

In reality, I have six three-dimensional arrays (1000x20x50) and six
t-tests in a for() loop.  It took me 3.5 minutes to run the above example
setting dims to c(1000, 20, 50).  I would appreciate any hints as to how to
speed this up.

I'm using S-PLUS® 6.2 for Windows PROFESSIONAL EDITION with XP operating
system.

JVA

You can build the t-test directly from the means & variance using rowMeans and rowVars:

set.seed(1)
dims <- c(100, 3, 2)
a <- array(rnorm(prod(dims)), dim=dims)
b <- array(rnorm(prod(dims)), dim=dims)

# for loop
t1 <- proc.time()
mt <- mp <- array(NA, dim=dims[1:2])
for(i in 1:dims[1]) {
  for(j in 1:dims[2]) {
    tt <- t.test(a[i, j, ], b[i, j, ])
    mt[i, j] <- tt$statistic
    mp[i, j] <- tt$p.value
  }
}
t2 <- proc.time()

t3 <- proc.time()
ma <- rowMeans(a, dims = 2)
mb <- rowMeans(b, dims = 2)
ssa <- rowVars(a, SumSquares = TRUE, dims = 2)
ssb <- rowVars(b, SumSquares = TRUE, dims = 2)
s <- sqrt((ssa + ssb)/(2 * dims[3] - 2))
tarr <- (ma - mb)/s
ptarr <- 2 * (1 - pt(abs(tarr), 2 * dims[3] - 2))
ptarr <- array(ptarr, dim = dims[1:2])
t4 <- proc.time()


<Prev in Thread] Current Thread [Next in Thread>