
RE: [S] Yet Another Vectorization Question Answered!

To: "'Kim Elmore'" <elmore@nssl.noaa.gov>
Subject: RE: [S] Yet Another Vectorization Question Answered!
From: "CHASALOW, SCOTT [AG/2165]" <SCOTT.CHASALOW@cereon.com>
Date: Thu, 23 Sep 1999 16:49:24 -0500
Cc: "'S-NEWS'" <s-news@wubios.wustl.edu>
Sender: owner-s-news@wubios.wustl.edu
Kim,

In the old days, before colSums() existed, and when lapply()
was substantially slower than it is now, I would use this:

rep(1, nrow(dfr)) %*% (dfr < 0)

which gives the number of negative values in each column.
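
Why this works: the comparison yields a logical matrix, %*% treats
TRUE/FALSE as 1/0, and multiplying by a row of ones adds up each
column. A minimal sketch on a made-up 3 x 2 data frame (the values
in dfr are purely illustrative, and the syntax is chosen to also
run in R, hence <- and the explicit as.matrix()):

```r
# Hypothetical toy data, not from the original post.
dfr <- data.frame(a = c(-1, 2, -3), b = c(4, -5, 6))

# Logical matrix: TRUE wherever a value is negative.
neg <- as.matrix(dfr < 0)

# A row of ones times the 0/1 matrix sums each column,
# giving a 1 x ncol(dfr) matrix of negative counts.
counts <- rep(1, nrow(dfr)) %*% neg

print(counts)  # column a has 2 negatives, column b has 1
```

The result is a one-row matrix rather than a plain vector; wrap it
in as.vector() if a vector is needed.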

Matrix multiplication can be a great tool for vectorizing
things.  I *thought* it would still be the most efficient
of all the options (and was therefore a little surprised
by Bill Dunlap's response).  I was wrong.  Here's a wee test in
S-PLUS 4.5 under Win NT4, Pentium 300, 128MB, on a data frame
with 500 rows and 100 columns (dos.time() reports seconds):

> junk _ as.data.frame(matrix(rnorm(50000), 500, 100))

> dos.time(unlist(lapply(junk, function(x) sum(x < 0))))
[1] 0.06054688

> dos.time(apply(junk, 2, function(x) sum(x < 0)))
[1] 2.639648

> dos.time(colSums(junk < 0))
[1] 0.1201172

> dos.time(rep(1, 500) %*% (junk < 0))
[1] 0.1503906
 
And here, with the shape flipped to 100 rows and 500 columns:

> junk _ as.data.frame(matrix(rnorm(50000), 100, 500))

> dos.time(unlist(lapply(junk, function(x) sum(x < 0))))
[1] 0.21875

> dos.time(apply(junk, 2, function(x) sum(x < 0)))
[1] 13.62891

> dos.time(colSums(junk < 0))
[1] 0.3300781

> dos.time(rep(1, 100) %*% (junk < 0))
[1] 0.359375
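
Before reading much into the timings, it is worth confirming that
the four expressions really return the same counts. A rough sketch
in R-compatible syntax, under a few assumptions: set.seed() and the
r1..r4 names are added here for illustration, and since dos.time()
is specific to S-PLUS for Windows, system.time() stands in for it
under R.

```r
set.seed(1)  # assumption: fixed seed so the check is repeatable
junk <- as.data.frame(matrix(rnorm(50000), 500, 100))

# The four approaches from the transcript, reduced to plain vectors.
r1 <- as.vector(unlist(lapply(junk, function(x) sum(x < 0))))
r2 <- as.vector(apply(junk, 2, function(x) sum(x < 0)))
r3 <- as.vector(colSums(junk < 0))
r4 <- as.vector(rep(1, nrow(junk)) %*% as.matrix(junk < 0))

# All four should agree column by column.
stopifnot(all(r1 == r2), all(r2 == r3), all(r3 == r4))

# Timing one of them under R; absolute numbers will differ a great
# deal from the 1999 figures quoted in this message.
print(system.time(colSums(junk < 0)))
```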

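In both runs lapply() was fastest, colSums() and the matrix
product were close behind, and apply() was slower by more than
an order of magnitude.  Live and learn.  Listen to Bill.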

Cheers,
Scott

Cereon Genomics
Scott.Chasalow@cereon.com
-----------------------------------------------------------------------
This message was distributed by s-news@wubios.wustl.edu.  To unsubscribe
send e-mail to s-news-request@wubios.wustl.edu with the BODY of the
message:  unsubscribe s-news
