s-news
[no subject]

If your data is really a data frame (not a matrix), you can speed
this up a bit with
        test <- unlist(lapply(df, function(dfi) all(dfi < 0)))
Most of the speedup in this case, however, comes from lapply
treating df as a list and using the equivalent of [[i]] instead
of [,i] on it.  (The subscripting methods for data frames are
rather slow; those for lists are fast.)  Thus you can squeeze
out a little more time by avoiding the overhead of lapply
and doing just

f3 <- function(df) {
        test <- logical(length(df))
        # names(test) <- names(df) # not needed, but handy
        df <- as.list(df)
        for(i in 1:length(df)) {
                test[i] <- all(df[[i]] < 0)
        }
        test
}
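
As a quick check that the two subscripting forms pick out the same
column, here is a small sketch.  The data frame and the names
col.list, col.frame and same are made up for illustration; only the
[[i]] versus [,i] comparison comes from the discussion above.

```r
# Made-up three-column data frame, for illustration only.
df <- data.frame(a = c(-1, -2), b = c(-3, 4), c = c(5, 6))

col.list  <- df[[2]]   # list-style extraction of column 2
col.frame <- df[, 2]   # data-frame (matrix-style) extraction of column 2

# Both forms should return the same numeric vector.
same <- identical(col.list, col.frame)
print(same)
```

Since both forms give the same vector, switching to [[i]] changes only
the cost of the extraction, not the result of all(... < 0).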

Let's call your original method f0:

f0 <- function(df) {
        test <- logical(ncol(df))
        for (i in 1:ncol(df))
        {
                test[i] <- all(df[,i] < 0)
        }
        test
}

and the lapply method f4:

f4 <- function(df) unlist(lapply(df, function(dfi) all(dfi < 0)))

For a 10-row by 100-column data frame I get:

> unix.time(f0(df))
[1] 1.279999 0.000000 2.000000 0.000000 0.000000
> unix.time(f3(df))
[1] 0.1300011 0.0000000 0.0000000 0.0000000 0.0000000
> unix.time(f4(df))
[1] 0.2700005 0.0000000 1.0000000 0.0000000 0.0000000

If we uncomment the line in f3 that assigns the names to the output,
the time goes up to that of f4.

The relative timings will change, depending on the dimensions
of df.
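
To try this on data of other shapes, something like the following
rough sketch may help.  It makes several assumptions: make.df is a
hypothetical helper that fills a data frame with random normals (the
data used for the timings above are not shown), the two shapes and the
repeat count of 20 are arbitrary, and system.time is used as R's
counterpart of unix.time.  No timings are quoted because they depend on
the machine and the version of S or R.

```r
# The three versions from above, repeated so the sketch is self-contained.
f0 <- function(df) {
    test <- logical(ncol(df))
    for (i in 1:ncol(df)) test[i] <- all(df[, i] < 0)
    test
}
f3 <- function(df) {
    test <- logical(length(df))
    df <- as.list(df)
    for (i in 1:length(df)) test[i] <- all(df[[i]] < 0)
    test
}
f4 <- function(df) unlist(lapply(df, function(dfi) all(dfi < 0)))

# Hypothetical helper: an nr-by-nc data frame of random normals, with
# the first column forced negative so that at least one result is TRUE.
make.df <- function(nr, nc) {
    m <- matrix(rnorm(nr * nc), nr, nc)
    m[, 1] <- -abs(m[, 1]) - 1
    as.data.frame(m)
}

wide <- make.df(10, 100)    # few rows, many columns
tall <- make.df(1000, 10)   # many rows, few columns

# The three versions should agree once f4's names are dropped.
r0 <- f0(wide)
r3 <- f3(wide)
r4 <- as.vector(f4(wide))
print(identical(r0, r3) && identical(r0, r4))

# Repeat each call so the elapsed times are large enough to compare.
for (d in list(wide, tall)) {
    print(dim(d))
    print(system.time(for (k in 1:20) f0(d)))
    print(system.time(for (k in 1:20) f3(d)))
    print(system.time(for (k in 1:20) f4(d)))
}
```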

----------------------------------------------------------------------------
Bill Dunlap                                      22461 Mt Vernon-Big Lake Rd
Data Analysis Products Div. of MathSoft, Inc.    Mount Vernon, WA 98274
bill@statsci.com                                 360-428-8146
************************************************************