On Thu, 24 Feb 2000, W. Li wrote:
> may i ask a beginner's question:
>
> i want to do logistic regression over all predictors plus
> products of these predictors. both predictors and the
> outcome are kept in a table a[]. suppose i have two
> predictors a[,1] and a[,2], (a[,3] is the outcome)
> then i use
>
> a <- glm(a[,3] ~ a[,1]+a[,2]+a[,1]:a[,2], family=binomial)
>
> or alternatively
>
> a <- glm(a[,3] ~ (a[,1]+a[,2])^2, family =binomial)
>
> these are fine.
>
> but what if i have (say) 100 predictors. i thought
>
> a <- glm(a[,101] ~ a[,c(1:100)]^2, family=binomial)
>
> would do the trick. but it does not! a[,c(1:100)]^2 somehow
> remains as a[,c(1:100)]. what happens here? on the other hand, i
> don't want to type (a[,1]+a[,2] +..... a[,100])^2, the whole line.
What's a `table' here? You want to use a data frame. Suppose
your data are in data frame df, and column "resp" is the response.
Then you can use
fit <- glm(resp ~ .^2, family=binomial, data=df)
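As a minimal sketch of the above, assuming R (or an S dialect whose formula handling matches it): the toy data frame, its column names x1..x3 and the simulated values are hypothetical, made up only for illustration. It also shows why the matrix version in the question appears to do nothing. Inside a formula, ^ is the crossing operator, not arithmetic squaring, so a single term (here a whole matrix) crossed with itself is just that term again.

```r
# Hypothetical toy data: three numeric predictors and a binary response.
set.seed(1)
n <- 200
df <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
df$resp <- rbinom(n, size = 1, prob = plogis(0.5 * df$x1 - 0.5 * df$x2))

# "." expands to every column of df other than the response, and ^2
# adds all pairwise interactions (products) of those columns.
fit <- glm(resp ~ .^2, family = binomial, data = df)

# Intercept + 3 main effects + choose(3, 2) = 3 interactions.
coef_names <- names(coef(fit))
print(coef_names)

# A single term raised to a power collapses back to itself, which is
# what happens when a whole matrix is used as one term.
single_term <- attr(terms(y ~ m^2), "term.labels")
print(single_term)
```

With the predictors as separate columns of a data frame, the formula machinery sees each one as its own term and can cross them; with a matrix it sees only one term.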
Now, a warning. To make sense of fitting the 5,000 or so regressors this
will generate in a Bernoulli regression you will likely need at least
100,000 observations (more if successes or failures are rare) and you will
not succeed in doing that with glm in S-PLUS. (The design matrix is going
to be measured in gigabytes.)
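The figures in the warning can be checked with a little arithmetic. This is a rough sketch under stated assumptions: all 100 predictors are numeric (factors would generate more columns), the design matrix is stored densely, and each entry is an 8-byte double. The variable names are illustrative only.

```r
p <- 100                          # number of predictors
n_terms <- p + choose(p, 2)       # main effects + pairwise interactions
n_cols <- n_terms + 1             # design matrix columns, with intercept
n_obs <- 100000                   # observations suggested in the warning

# Dense storage at 8 bytes per double (an assumption).
bytes <- n_obs * n_cols * 8
gb <- bytes / 1e9

print(n_terms)   # 5050 regressors
print(gb)        # about 4 GB for the design matrix alone
```

Any working copies made during fitting would add to this figure.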
--
Brian D. Ripley, ripley@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272860 (secr)
Oxford OX1 3TG, UK Fax: +44 1865 272595