| To: | 'Pravin' <jadhavpr@vcu.edu>, s-news@wubios.wustl.edu |
|---|---|
| Subject: | Re: S-PLUS Vs some other softwares |
| From: | "Gunter, Bert" <bert_gunter@merck.com> |
| Date: | Tue, 2 Mar 2004 08:57:01 -0500 |

As Andy said, lmList will probably do what you want, but since it uses lm(), it may also take a while. If you wish to do it "by hand" yourself, try by() [which is a wrapper for tapply()] and use lsfit() instead of lm(). As others have said, lm() may incur a lot of overhead. The code would be something like:

    results <- by(the.data.frame, the.data.frame$patientID,
                  function(z) coef(lsfit(z$x, z$y))[2])

where x and y are the x and y values for each patient (note that they are in the reverse order of lm() syntax, which would be lm(y~x)). results is an object of class "by": essentially a list of slopes, one per distinct patient ID, plus some additional attribute information that you can probably ignore.
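
For anyone who wants to try this on a small case first, here is a minimal sketch written for R, which shares by() and lsfit() with S-PLUS. The simulated data, the column names patientID, x and y, and the flattening step at the end are my assumptions for illustration, not part of the original poster's data.

```r
# Simulated stand-in data: 5 patients with 9 samples each.
# Column names (patientID, x, y) are assumed for illustration.
set.seed(1)
the.data.frame <- data.frame(
  patientID = rep(1:5, each = 9),
  x = rep(1:9, times = 5)
)
# True slope for patient i is 0.5 * i, plus a little noise.
the.data.frame$y <- 2 + 0.5 * the.data.frame$patientID * the.data.frame$x +
  rnorm(nrow(the.data.frame), sd = 0.1)

# One slope per patient. lsfit() takes (x, y), the reverse of lm(y ~ x);
# element 2 of the coefficient vector is the slope.
results <- by(the.data.frame, the.data.frame$patientID,
              function(z) coef(lsfit(z$x, z$y))[2])

# Strip the "by" class and attributes to get a plain numeric vector,
# then label each slope with its patient ID.
slopes <- as.numeric(results)
names(slopes) <- dimnames(results)[[1]]
print(slopes)
```

The as.numeric() step is only a convenience: it leaves a plain vector that is easier to summarize or plot than the "by" object itself.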

Let us know how this all works out for you. Specifically, did any of the suggestions give you an answer in a "reasonable" amount of time, where you define "reasonable." Large data sets **CAN** be a problem, but 500,000 x 8 with little covariate info doesn't sound all that large really, especially these days.
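
One way to judge "reasonable" before committing to the full run is to time both fitters on a small subset. The sketch below assumes R (system.time() is the R name; S-PLUS spells its timing function differently), and the 2,000-patient simulated data set and its column names are made up for illustration. Actual timings will depend on the machine, so none are quoted here.

```r
# Simulated subset: 2,000 patients with 9 samples each (assumed sizes,
# far smaller than the 500,000 patients discussed in the thread).
set.seed(42)
n.patients <- 2000
d <- data.frame(patientID = rep(seq_len(n.patients), each = 9),
                x = rep(1:9, times = n.patients))
d$y <- 1 + 0.3 * d$x + rnorm(nrow(d))

# Per-patient slopes via lm(), which builds a full model object each time.
time.lm <- system.time(
  s.lm <- by(d, d$patientID, function(z) coef(lm(y ~ x, data = z))[2])
)

# Per-patient slopes via lsfit(), which skips the formula machinery.
time.lsfit <- system.time(
  s.lsfit <- by(d, d$patientID, function(z) coef(lsfit(z$x, z$y))[2])
)

# Both fitters should agree on the slopes up to rounding error.
max.diff <- max(abs(as.numeric(s.lm) - as.numeric(s.lsfit)))

# Elapsed seconds for each approach on this subset.
print(c(lm = time.lm[["elapsed"]], lsfit = time.lsfit[["elapsed"]]))
```

Scaling the elapsed time by 500,000 / 2,000 gives a rough estimate for the full job, assuming the cost grows about linearly with the number of patients.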

Cheers,
Bert Gunter

"The business of the statistician is to catalyze the scientific learning process." -- George E.P. Box

-----Original Message-----
From: s-news-owner@lists.biostat.wustl.edu [mailto:s-news-owner@lists.biostat.wustl.edu] On Behalf Of Pravin
Sent: Monday, March 01, 2004 7:39 PM
To: s-news@wubios.wustl.edu
Subject: [S] S-PLUS Vs some other softwares

Hi all,

(Almost) always I have written S-PLUS code where for() loop looked indispensable to ME. Since it did my job at the expense of slightly more dos time, I never looked at the alternatives. But, this time I have a very simple problem and I thought for() loop should be able to do the job. But it didn't!

I am doing one permutation experiment that requires me to analyze data from 500,000 patients (9 samples per patient) and all I want to fit is the linear regression model and extract the estimates of slope on each patient. After running my computer for 16 hrs (CPU usage looked like it was computing all the time), S-PLUS reached patient number 19,000... Is there any quicker way of doing this in S-PLUS? Or, from what I always hear, S-PLUS is limited by its ability to handle huge datasets at hand; do I have to look for some other software that can do this huge computational task really quickly? Any recommendations?

LOOP:

Pravin

Pravin Jadhav