Actually, there is no need for the S+FinMetrics module, even for
speed-efficient code:
> junk <- matrix(1:(5000 * 100), ncol = 100)
> dos.time(x <- matrix(unlist(lapply(lapply(
+     split(junk, rep(1:100, rep(50, 100))),
+     matrix, ncol = 100), colMeans)), ncol = 100, byrow = T))
[1] 0.641
Just to check the result:
> dos.time(x1 <- aggregate(junk, by = rep(1:100, rep(50, 100)), mean)[, -1])
[1] 10.531
> all(x == x1)
[1] T
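
The same block means can also be sketched without split() and lapply(). What follows is only a minimal sketch in R syntax, not something timed here: it assumes the row count is an exact multiple of the block size, and block_means() is a hypothetical helper name, not a function from any module. It relies on rowsum() to add up the rows within each group; whether rowsum(), seq_len() and rep(each = ) behave identically in a given S-PLUS release is an assumption to check.

```r
# Hypothetical helper: average every `block` consecutive rows of a matrix.
# Assumes nrow(m) is an exact multiple of `block`.
block_means <- function(m, block) {
  stopifnot(nrow(m) %% block == 0)
  # Group label for each row: 1,1,...,1,2,2,...,2,... (`block` copies of each)
  groups <- rep(seq_len(nrow(m) / block), each = block)
  # rowsum() sums the rows within each group; dividing by the block
  # size turns the per-group column sums into per-group column means.
  out <- rowsum(m, groups) / block
  dimnames(out) <- NULL   # drop the group labels that rowsum() adds
  out
}

junk <- matrix(1:(5000 * 100), ncol = 100)
x2 <- block_means(junk, 50)
dim(x2)
```

Because the grouping vector is built from nrow(m), the same call should work for any matrix whose row count divides evenly by the block size.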
PS: The code using roll() is cleaner.
===================================================================
Vadim Kutsyy, PhD http://www.kutsyy.com
vkutsyy@cytokinetics.com vadim@kutsyy.com
Statistician tel. 650-624-3218
Cytokinetics, Inc fax. 650-624-3010
280 East Grand Avenue
South San Francisco, CA 94080 http://www.cytokinetics.com
===================================================================
> -----Original Message-----
> From: Jeffrey Wang [mailto:jwang@insightful.com]
> Sent: Thursday, June 20, 2002 10:51 AM
> To: Shin, David; S-news@wubios.wustl.edu
> Subject: Re: [S] about aggregate
>
>
> Here's the code to do it using aggregate():
>
> > junk = matrix(1:(5000*100), ncol=100)
> > f = rep(paste("f", 1:100, sep=""), rep(50,100))
> > f = ordered(f, levels=paste("f", 1:100, sep=""))
> > dos.time(tmp <<- aggregate(junk, by=f, FUN=mean))
> [1] 12.918
>
> Here's the code to do it using the generic roll()
> function in the new S+FinMetrics module:
>
> > tmp.fun = function(data) {
> > list(colMeans(data))
> > }
> > dos.time(tmp2 <<- roll(tmp.fun, junk, 50, incr=50, trace=F)[[1]])
> [1] 0.922
>
> The second approach utilizes the colMeans() function,
> and thus is much faster.
>
> =*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*
> Jeffrey Wang Research Scientist
> Insightful Corp. (206) 802-2269
>
>
>
>
>
> -----Original Message-----
> From: Shin, David [mailto:David_Shin@ctb.com]
> Sent: Thursday, June 20, 2002 10:03 AM
> To: S-news@wubios.wustl.edu
> Subject: [S] about aggregate
>
>
> Hi,
>
> I am a new user of this list. If I say something wrong, please correct
> and pardon me. Thanks.
>
> I have a question about "aggregate".
> I have a data matrix that is 5000 (rows) x 100 (columns).
> I want S-PLUS to compute the means for every 50 rows so that I can get
> a new matrix that is 100 x 100.
>
> I tried this command: aggregate(my.data, ndeltat = 50, fun = mean),
> but it doesn't work.
> If someone knows how to do this, would you please tell me.
> Thank you very much.
>
> David
>
> --------------------------------------------------------------------
> This message was distributed by s-news@lists.biostat.wustl.edu. To
> unsubscribe send e-mail to s-news-request@lists.biostat.wustl.edu with
> the BODY of the message: unsubscribe s-news