To: mdradmac@helix.nih.gov (Michael Radmacher)
Subject: Re: [S] How do you create a formula for a model with a large number of variables?
From: Edward Malthouse <ecm@casbah.acns.nwu.edu>
Date: Wed, 23 Jun 1999 16:55:19 -0500 (CDT)
Cc: s-news@wubios.wustl.edu
In-reply-to: <004d01bebdbc$6c58c1d0$b5c8e780@fed-radmachm.cit.nih.gov> from "Michael Radmacher" at Jun 23, 99 05:07:37 pm
Sender: owner-s-news@wubios.wustl.edu
> I'm trying to do a stepwise logistic regression using the step.glm function.
> My question is about how to create the formula for my model.  I have a large
> set of variables to work with (over 100) and want to do a step-forward
> regression which starts with a null model, adding one variable at a time
> until it attains the best fit.
> 
> The problem is, as input to step.glm, I must define the scope of models for
> the stepwise search.  The formula for the lower bound is easy (it's RESPONSE
> ~ 1) but the upper bound should contain all of the more than 100 potential
> regressor variables.  My question is, how can this be done in Splus without
> having to explicitly list every single one of the variables in the formula
> (i.e., RESPONSE ~ X1 + X2 + ... + X100 + ...)?
> 
> I've tried doing this by using a matrix, X, where each column of the matrix
> contains one of the regressor variables and then using the formula RESPONSE
> ~ X.  The problem with this is that X is considered a single factor and in
> the stepwise regression, only the entire matrix is considered for addition,
> not individual columns of the matrix.
> 
> I think there must be a simple solution to this problem, but haven't had any
> luck looking through the manuals I have. Any help you can give would be
> greatly appreciated.
> 
> Thanks,
> Michael Radmacher
> 
> 
> -----------------------------------------------------------------------
> This message was distributed by s-news@wubios.wustl.edu.  To unsubscribe
> send e-mail to s-news-request@wubios.wustl.edu with the BODY of the
> message:  unsubscribe s-news
> 

When I have to analyze a data set in Splus with many variables, I
create a text file with my formula and source it in.  Utility
programs such as awk and perl are indispensable when it comes to
creating such a file.  For example, suppose the file "formula.s"
contains

myform <- y~x1+x2+x3+x4+x5+x6+x7+x8+x9+x10

Then, at the Splus prompt, I type
> source("formula.s")
> fit <- lm(myform, data=mydata)
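
The file itself need not come from awk or perl; as a minimal sketch, it could also be written from within S. The code below assumes, purely for illustration, a response named y and regressors named x1 through x100; substitute the actual column names of the data set. It uses only paste, cat and source.

```r
# Hypothetical names: response "y", regressors "x1" ... "x100".
xnames <- paste("x", 1:100, sep = "")

# Join the names into the right-hand side: "x1 + x2 + ... + x100".
rhs <- paste(xnames, collapse = " + ")

# Write a single assignment line to formula.s, in the same form as
# the hand-made file shown above.
cat("myform <- y ~ ", rhs, "\n", sep = "", file = "formula.s")

# Source the file so that myform is defined in the session.
source("formula.s")
```

A formula built this way could presumably also serve as the upper bound of the scope in a stepwise search, with RESPONSE ~ 1 as the lower bound; check the step.glm help file for the exact form the scope argument expects.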

Ed Malthouse

Dr. Edward C. Malthouse
Assistant Professor
Integrated Marketing Communications Department
Medill School of Journalism
1908 Sheridan Road
Evanston, IL  60208-1290
Tele:  847-467-3376
Fax:  847-491-5925
