Rich@Mango-Solutions.com wrote:
You could use sapply to pass in a vector of numbers (for column numbers) or
characters (for column names) and index the columns of the input data.
Something like this:-
innerFun <- function(i, df) {
    # i is the column index, so the function can branch on it
    if (i == 1) mean(df[[i]])
    else median(df[[i]])
}
sapply(seq(along = myDf), innerFun, df=myDf)
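For a matrix like the mymat in the original question, the same idea might look roughly like the sketch below. It is written in R syntax, which I would expect to carry over to S-PLUS largely unchanged; the example data and the name colFun are made up for illustration.

```r
# Hypothetical example data; 'mymat' stands in for the matrix in the question.
mymat <- matrix(1:12, nrow = 3, dimnames = list(NULL, c("a", "b", "c", "d")))

# colFun (a made-up name) receives the column index j rather than the column
# itself, so it knows which column it is working on and can branch on j
# (or on colnames(m)[j]).
colFun <- function(j, m) {
    if (j == 1) mean(m[, j]) else median(m[, j])
}

# Loop over column indices instead of over the columns themselves.
res <- sapply(seq(ncol(mymat)), colFun, m = mymat)
names(res) <- colnames(mymat)
print(res)
```

Looping over colnames(mymat) instead of the indices should work the same way, with m[, j] then indexing by name.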
Not as efficient as apply for larger data structures I'd imagine, since
you're passing the entire dataset in each time ...
Rich.
Actually, S-PLUS usually avoids making unnecessary copies of datasets in
this type of situation. (AFAIK, S-PLUS uses techniques along the lines
of only duplicating an object when it is changed, so local "copies" of
unchanged objects can just be references.)
Here are some operations on a large dataset, with measurements showing
that extra copies are not made:
> x <- matrix(rnorm(1e6), ncol=1000,
+     dimnames=list(paste("r",1:1000,sep=""), paste("c",1:1000,sep="")))
>
> # ordinary apply() version
> mem.tally.reset()
> print(sys.time({r1 <- apply(x, 1, sum); print(mem.tally.report())}))
new database evaluation
0 8097114
[1] 0.250 1.063
> mem.tally.reset()
> print(sys.time({r1 <- apply(x, 1, sum); print(mem.tally.report())}))
new database evaluation
0 8097114
[1] 0.265 1.062
> all(r1==r2)
[1] T
>
> # version that uses sapply() to loop over indices and access global
> # variable x
> mem.tally.reset()
> print(sys.time({r2 <- sapply(seq(nrow(x)), function(i) sum(x[i,]));
+     print(mem.tally.report())}))
new database evaluation
0 8879045
[1] 0.625 1.454
> mem.tally.reset()
> print(sys.time({r2 <- sapply(seq(nrow(x)), function(i) sum(x[i,]));
+     print(mem.tally.report())}))
new database evaluation
0 8879045
[1] 0.797 1.640
>
> # version that uses sapply() to loop over indices and also
> # passes in matrix x
> mem.tally.reset()
> print(sys.time({r3 <- sapply(seq(nrow(x)), function(i,y) sum(y[i,]), x);
+     print(mem.tally.report())}))
new database evaluation
0 8935493
[1] 0.891 1.672
> mem.tally.reset()
> print(sys.time({r3 <- sapply(seq(nrow(x)), function(i,y) sum(y[i,]), x);
+     print(mem.tally.report())}))
new database evaluation
0 8939525
[1] 0.891 1.703
> all(r1==r3)
[1] T
>
So both sapply() versions use about 1.1 times the memory used by the
apply() version, and take about 2.5 to 3.5 times the CPU. The latter
makes some sense because apply() in S-PLUS uses an optimized C function
for iterations when its FUN returns vectors of identical sizes.
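For anyone who wants to try a comparable experiment in R, here is a rough sketch. It is only an analogue: mem.tally.reset(), mem.tally.report() and sys.time() are S-PLUS functions, so this version uses R's system.time() and reports timings only, and the numbers (and the relative ranking of apply() and sapply()) may well differ from the S-PLUS figures above.

```r
set.seed(1)                      # arbitrary seed, only for repeatability
x <- matrix(rnorm(1e6), ncol = 1000)

# ordinary apply() version
t1 <- system.time(r1 <- apply(x, 1, sum))

# sapply() over row indices, accessing the global variable x
t2 <- system.time(r2 <- sapply(seq(nrow(x)), function(i) sum(x[i, ])))

# sapply() over row indices, passing the matrix in as an extra argument
t3 <- system.time(r3 <- sapply(seq(nrow(x)), function(i, y) sum(y[i, ]), x))

# user, system and elapsed times for the three versions, one per row
print(rbind(apply = t1, sapply.global = t2, sapply.arg = t3)[, 1:3])

# all three versions should produce the same row sums
print(all(r1 == r2) && all(r1 == r3))
```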
This was done with S-PLUS 7.0.6 for Windows, and I've observed similar
behavior in all versions from S-PLUS 6 onwards. One thing to note is
that loops done using sapply() (or lapply()) can be much more memory
efficient than loops done with 'for' -- sometimes unneeded memory used
in iterations of 'for' loops is not effectively reclaimed at the end of
each iteration, leading to out-of-memory errors.
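As a small illustration of what rewriting a 'for' loop as an sapply() loop looks like, here is a sketch in R syntax with a made-up toy matrix. It only shows that the two forms compute the same thing; it does not demonstrate the memory behavior described above, which is specific to S-PLUS.

```r
x <- matrix(1:6, nrow = 2)       # hypothetical small input

# 'for' version: results are stored into a preallocated vector
out.for <- numeric(nrow(x))
for (i in seq(nrow(x))) out.for[i] <- sum(x[i, ])

# sapply() version: the loop body becomes a function of the index i
out.sapply <- sapply(seq(nrow(x)), function(i) sum(x[i, ]))

print(all(out.for == out.sapply))
```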
-- Tony Plate
-----Original Message-----
From: s-news-owner@lists.biostat.wustl.edu
[mailto:s-news-owner@lists.biostat.wustl.edu] On Behalf Of Schwarz,Paul
Sent: 11 February 2006 02:19
To: s-news@lists.biostat.wustl.edu
Subject: [S] identifying which column/row is passed in calls to apply()
S-News readers,
How would I identify which column (or row) is being operated on in a
call to the function specified in apply()? That is, if I wanted to
perform some column-specific (or row-specific) operation, how do I
identify which column/row is being passed to the function that's
specified in the call to apply(), where the number of columns ranges
from 1 to ncol(mymat)?
apply( mymat, 2, function(x){ ? } )
I suspect that this is easy, but I'm drawing a blank on how to do this.
Thanks a lot everyone.
-Paul
--------------------------------------------------------------------
This message was distributed by s-news@lists.biostat.wustl.edu. To
unsubscribe send e-mail to s-news-request@lists.biostat.wustl.edu with
the BODY of the message: unsubscribe s-news
--------------------------------------------------------------------