Kim,
In the old days, before colSums() existed, and when lapply()
was substantially slower than it is now, I would use this:
rep(1, nrow(dfr)) %*% (dfr < 0)
which gives the number of negative values in each column.
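To see why the product above counts negatives, here is a minimal sketch in R syntax. The toy data frame is hypothetical; only the name dfr is taken from the expression above. The comparison dfr < 0 gives a logical matrix, and multiplying a row of ones by it adds up the TRUEs in each column.

```r
# Hypothetical toy data; 'dfr' mirrors the name used in the message.
dfr <- data.frame(a = c(-1, 2, -3), b = c(4, 5, -6), c = c(7, 8, 9))

# Logical matrix: TRUE where a value is negative.
# In arithmetic, TRUE counts as 1 and FALSE as 0.
neg <- dfr < 0

# A 1 x nrow vector of ones times an nrow x ncol matrix
# yields a 1 x ncol matrix of column totals.
by.matmul <- rep(1, nrow(dfr)) %*% neg

# The same counts from colSums(), for comparison.
by.colsums <- colSums(neg)

print(by.matmul)
print(by.colsums)
```

Both results hold one count per column; the matrix product comes back as a 1-row matrix, while colSums() returns a named vector.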
Matrix multiplication can be a great tool for vectorizing
things. I *thought* this would still be the most efficient
of all the options (and was therefore a little surprised
by Bill Dunlap's response). I was wrong. Here's a wee test in
S-PLUS 4.5 under Win NT4, Pentium 300, 128MB:
> junk _ as.data.frame(matrix(rnorm(50000), 500, 100))
> dos.time(unlist(lapply(junk, function(x) sum(x < 0))))
[1] 0.06054688
> dos.time(apply(junk, 2, function(x) sum(x < 0)))
[1] 2.639648
> dos.time(colSums(junk < 0))
[1] 0.1201172
> dos.time(rep(1, 500) %*% (junk < 0))
[1] 0.1503906
And here, with the dimensions swapped (100 rows, 500 columns):
> junk _ as.data.frame(matrix(rnorm(50000), 100, 500))
> dos.time(unlist(lapply(junk, function(x) sum(x < 0))))
[1] 0.21875
> dos.time(apply(junk, 2, function(x) sum(x < 0)))
[1] 13.62891
> dos.time(colSums(junk < 0))
[1] 0.3300781
> dos.time(rep(1, 100) %*% (junk < 0))
[1] 0.359375
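For anyone who wants to repeat the comparison in current R, where dos.time() is not available, a rough sketch using system.time() follows. This is an assumption about how one might port the test, not part of the original session, and the timings and their ordering may well differ from the S-PLUS 4.5 figures above.

```r
# Same shape as the first test: 500 rows, 100 columns.
junk <- as.data.frame(matrix(rnorm(50000), 500, 100))

# system.time() reports user, system and elapsed times;
# keep only the elapsed time (in seconds) for each approach.
timings <- c(
  lapply  = system.time(unlist(lapply(junk, function(x) sum(x < 0))))[["elapsed"]],
  apply   = system.time(apply(junk, 2, function(x) sum(x < 0)))[["elapsed"]],
  colSums = system.time(colSums(junk < 0))[["elapsed"]],
  matmul  = system.time(rep(1, nrow(junk)) %*% (junk < 0))[["elapsed"]]
)

print(timings)
```

On a data frame this small, a single run may be too quick to measure reliably, so repeating each call in a loop would give steadier numbers.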
Live and learn. Listen to Bill.
Cheers,
Scott
Cereon Genomics
Scott.Chasalow@cereon.com
-----------------------------------------------------------------------
This message was distributed by s-news@wubios.wustl.edu. To unsubscribe
send e-mail to s-news-request@wubios.wustl.edu with the BODY of the
message: unsubscribe s-news