On Wed, 3 Mar 2004 08:48:52 +0100
Martin Maechler <maechler@stat.math.ethz.ch> wrote:
> >>>>> "BertG" == Gunter, Bert <bert_gunter@merck.com>
> >>>>> on Tue, 2 Mar 2004 08:57:01 -0500 writes:
>
> BertG> As Andy said, lmList will probably do what you want,
> BertG> but as it uses lm(), it may also take a while. If you
> BertG> wish to do it "by hand" yourself, try by() [which is
> BertG> a wrapper for tapply()] and use lsfit instead of
> BertG> lm. As others have said, lm may incur a lot of
> BertG> overhead. The code would be something like
>
> .................
>
> I think it should be lm.fit() , not lsfit().
> This will use the numerical engine below lm() without
> the overhead of formula -> model.frame -> model.matrix.
>
> Martin Maechler <maechler@stat.math.ethz.ch> http://stat.ethz.ch/~maechler/
A slightly faster approach may be to use the following function which is
part of the Hmisc library. The S-Plus version is shown below. x is a
matrix, y a vector.
lm.fit.qr.bare <- function(x, y,
tolerance = .Machine$single.eps,
intercept=TRUE, xpxi=FALSE) {
if(intercept) x <- cbind(1,x)
if(storage.mode(x) != "double") storage.mode(x) <- "double"
if(storage.mode(y) != "double") storage.mode(y) <- "double"
dx <- dim(x)
dn <- dimnames(x)
qty <- y
n <- dx[1]
n1 <- 1:n
p <- dx[2]
p1 <- 1:p
dy <- c(n, 1)
z <- .Fortran("dqrls",
qr = x,
as.integer(dx),
pivot = as.integer(p1),
qraux = double(p),
y,
as.integer(dy),
coef = double(p),
residuals = y,
qt = qty,
tol = as.double(tolerance),
double(2 * p),
rank = as.integer(p))
coef <- z$coef
if(length(dn[[2]])) names(coef) <- dn[[2]]
res <- z$residuals
sse <- sum(res^2)
sst <- sum((y-mean(y))^2)
res <- list(coefficients=coef, residuals=res,
rsquared=1-sse/sst, fitted.values=y-res)
if(xpxi) {
R <- (z$qr)[p1, , drop = FALSE]
R[lower.tri(R)] <- 0
rinv <- solve(R, diag(length(coef)))
xpxi <- rinv %*% t(rinv)
res$xpxi <- xpxi
}
res
}
I use this for bootstrapping, etc.
Frank
|