Many thanks to Mr. Berry and Mr. Lancelot for sharing their wisdom with
me; I've got my job done with their very helpful tips:
On Wed, 20 Oct 1999 17:18:34 +0100, Renaud Lancelot
<lancelot@telecomplus.sn> wrote:
Bin Chen wrote:
>
> Hi. Is there a simple command that gives the power set of a vector
> (that is, all its subsets)? I'm trying to run 256 variants of a least
> squares model involving a maximum of 8 independent variables
> (including the intercept) and then compute their posterior odds. I
> figure if I can get all the subsets of the names of the X data.frame I
> would be able to do this quickly, using a loop in which
>
> subset.of.names <-- ?? names(X) ??
> lm(data.frame(y,X[,row(matrix(names(X)))[names(X)==subset.of.names]]))
Hi Bin,
I recently met the same problem and here is how I did:
Suppose we want to perform all subset models of:
lm(cost ~ age + type + car.age, data=na.omit(claims))
1. Write a subsets() function, found in VR 2nd ed. p. 149 or in the
S-news archives:
subsets <- function(n, k, set = 1:n)
if(k <= 0) NULL else
if(k >= n) set else
rbind(cbind(set[1], Recall(n - 1, k - 1, set[-1])),
Recall(n - 1, k, set[-1]))
2. Compute all the covariate combinations:
> f <- c("age", "type", "car.age")
> form <- NULL
> K <- 0
> for(i in 1:length(f)) {
m <- subsets(length(f), i)
l <- ifelse(i < length(f), dim(m)[1], 1)
for(j in 1:l) {
K <- K + 1
formu <- NULL
for(k in 1:i) {
if(i < length(f))
formu <- paste(formu, f[m[j, k]], sep =
ifelse(k == 1, "", "+"))
else formu <- paste(formu, f[k], sep = ifelse(k ==
1, "", "+"))
}
formu <- paste("cost ~ ", formu, sep = "")
form[K] <- formu
}
}
> rm(m, l, K, formu)
> form
[1] "cost ~ age" "cost ~ type" "cost ~
car.age" "cost ~ age+type"
[5] "cost ~ age+car.age" "cost ~ type+car.age" "cost ~
age+type+car.age"
3. Compute all the linear models:
> for(i in 1:length(form)) {
assign(paste("lmod.", i, sep = ""), try(lm(form[i], data =
na.omit(claims))), where = 1, immediate =
T)
}
> objects(pattern = "lmod.*")
[1] "lmod.1" "lmod.2" "lmod.3" "lmod.4" "lmod.5" "lmod.6" "lmod.7"
> tapply(objects(pattern = "lmod.*"), 1:length(objects(
pattern = "lmod.*")), function(x)
class(get(x)))
1 2 3 4 5 6 7
"lm" "lm" "lm" "lm" "lm" "lm" "lm"
> result <- tapply(objects(pattern = "lmod.*"), 1:length(
objects(pattern = "lmod.*")), function(x)
summary(get(x)))
> names(result) <- form
> result[3]
$"cost ~ car.age":
Call: lm(formula = form[i], data = na.omit(claims))
Residuals:
Min 1Q Median 3Q Max
-144.7 -49.47 -29.69 12.78 576.3
Coefficients:
Value Std. Error t value Pr(>|t|)
(Intercept) 235.3179 10.3098 22.8247 0.0000
car.age.L -120.6810 20.8369 -5.7917 0.0000
car.age.Q -23.3411 20.6196 -1.1320 0.2599
car.age.C -0.3798 20.3999 -0.0186 0.9852
Residual standard error: 114.2 on 119 degrees of freedom
Multiple R-Squared: 0.224
F-statistic: 11.45 on 3 and 119 degrees of freedom, the p-value is
1.195e-006
Correlation of Coefficients:
(Intercept) car.age.L car.age.Q
car.age.L 0.0488
car.age.Q 0.0265 0.0420
car.age.C 0.0050 0.0159 0.0259
*********************
Careful:
1. assign() will overwrite any existing object with the same name as the
assigned name
2. If you plan to include interaction and / or some of your models are
saturated or lead to singular matrices, you will have errors and for()
will stop. That's the reason why I have embedded the fitting function
within a try() call. This function exists in S+ 2000 and is as
following:
try <- function(expr, first = T)
{
# evaluates expr (in frame of caller) and returns either
# its value or the error message (with class "Error").
# DO NOT USE THE 'first' ARGUMENT.
restart(first)
if(first) {
first <- F
expr
}
else {
error.msg <- get.message()
class(error.msg) <- "Error"
error.msg
}
}
In case of an error occurring within try(), an object of class "error"
is returned, which can be separated from regular fitted objects before
editing the summaries or doing whatever you want.
3. You can also decide to set options(warn=2) which turns any warning
(false convergence, etc.) into an error.
Hope this helps,
Renaud
--
Renaud Lancelot
ISRA-LNERV
BP 2057 Dakar-Hann
Senegal
tel (221) 832 49 02
fax (221) 821 18 79
email renaud.lancelot@cirad.fr
On Wed, 20 Oct 1999 09:48:09 -0700 (PDT), "Charles C. Berry"
<cberry@tajo.ucsd.edu> wrote:
Hmmm. This sounds more like a job for leaps(), but if lm() really is the
way to
go,
all.combos <- as.logical(
as.matrix(
do.call("expand.grid",rep(list(c(0,1)),8))
))
returns a matrix whose 256 rows give all possible subsets
then names(X)[all.combos[i,]] will return the names of the vraibles
corresponding to the subset for the i^{th} row.
>
>Hi. Is there a simple command that gives the power set of a vector
>(that is, all its subsets)? I'm trying to run 256 variants of a least
>squares model involving a maximum of 8 independent variables
>(including the intercept) and then compute their posterior odds. I
>figure if I can get all the subsets of the names of the X data.frame I
>would be able to do this quickly, using a loop in which
>
>subset.of.names <-- ?? names(X) ??
>lm(data.frame(y,X[,row(matrix(names(X)))[names(X)==subset.of.names]]))
>...
>
>Thanks a lot in advance.
>
>- Bin Chen
>-----------------------------------------------------------------------
>This message was distributed by s-news@wubios.wustl.edu. To unsubscribe
>send e-mail to s-news-request@wubios.wustl.edu with the BODY of the
>message: unsubscribe s-news
Charles C. Berry (858) 534-2098
Dept of Family/Preventive
Medicine
E mailto:cberry@tajo.ucsd.edu UC San Diego
http://hacuna.ucsd.edu/members/ccb.html La Jolla, San Diego 92093-0645
-----------------------------------------------------------------------
This message was distributed by s-news@wubios.wustl.edu. To unsubscribe
send e-mail to s-news-request@wubios.wustl.edu with the BODY of the
message: unsubscribe s-news
|