s-news
[Top] [All Lists]

Re: [S] on vcov()

To: Stephanie MAHEVAS <Stephanie.Mahevas@ifremer.fr>
Subject: Re: [S] on vcov()
From: Bill Venables <William.Venables@cmis.CSIRO.AU>
Date: Tue, 25 Jan 2000 00:36:13 +1000
Cc: s-news@wubios.wustl.edu
In-reply-to: Your message of "Mon, 24 Jan 2000 10:41:04 +0100." <388C1E30.4E3E0A54@ifremer.fr>
Sender: owner-s-news@wubios.wustl.edu
> 
> Dear s-plus users,
> 
> I do a simple linear model with n factors and each factor has
> respectively
> N(i) levels (i=3D1,..,n). The contrasts for fitting are default ones. To
> get the estimators' variance (to construct confidence intervals of each
> estimator' coefficient) I use the fonction vcov(lm.out) of
> library(MASS).
> The size of the variance-covariance matrix is
> [(1+(N(1)-1)+...+(N(n)-1))]*[(1+(N(1)-1)+...+(N(n)-1))] and not
> [(1+N(1)+...+N(n))]*[(1+N(1)+...+N(n))] as I would like ...!
> 
> WHY and anyone know how to get the variance of each coefficient
> (associated to each level factor)?
> 
> 
> thanks in advance.
> Stephanie

Dear Stephanie,

As Professor Ripley has already explained, the variance matrix
returned by vcov for the coefficient vector estimated in the
linear model.  For factor models these are a reduced set
corresponding to a (generally) non-redundant set of columns in
the model matrix.

If X (n x p) is a complete set of binary columns corresponding to
a p-level factor term (main effect) in a single classification
linear model, the non-redundant set of columns has the form XC,
where C (p x (p-1)) is the so-called "contrast matrix" for the
factor (and can be set in various ways).

Hence if b ((p-1) x 1) is the reduced set of coefficients for the
factor the complete set is g = Cb.  This relation shows how to
get the variance matrix for the full set as well.  For just one
main-effect term the computation would go as follows:

library(MASS)
fm <- lm(y ~ f, data)
b0 <- coef(fm)  
Vb0 <- vcov(fm)

C <- contrasts(data$f)
H <- rbind(c(1, rep(0, dim(C)[2])), cbind(0, C))
g0 <- H %*% b0
Vg0 <- H %*% V %*% t(H)

At this point g0 is the expanded coefficient vector and Vg0 is
the (1 + p) x (1 + p) variance matrix you were expecting.

You can see that if you have more than one term the calculation
is going to get rather tedious (mainly in calculating the matrix
I call H above), though it is easy enough to see how to do it, so
long as you only have main-effect terms.  If you have interaction
terms things get tricky as well as tedious.

There is a function available, dummy.coef() that will get you the
redundant coefficient vector (g0 above) but it will not get you
its variance matrix.  Writing a fully general function to produce
this expanded variance matrix is a programming exercise that
forces you to come to grips with the mysteries of contrast
matrices like nothing else.  The problem is that the calculation
is very tedious to program and in the end the expanded variance
matrix is rarely of any genuine statistical interest.

Bill Venables.

-- 
-----------------------------------------------------------------
Bill Venables, Statistician, CMIS Environmetrics Project.

Physical address:                            Postal address:
CSIRO Marine Laboratories,                   PO Box 120,       
233 Middle St, Cleveland, Qld, 4163          Cleveland, Qld, 4163
AUSTRALIA                                    AUSTRALIA

Tel: +61 7 3826 7251           Email: Bill.Venables@cmis.csiro.au     
Fax: +61 7 3826 7304      http://www.cmis.csiro.au/bill.venables/
-----------------------------------------------------------------------
This message was distributed by s-news@wubios.wustl.edu.  To unsubscribe
send e-mail to s-news-request@wubios.wustl.edu with the BODY of the
message:  unsubscribe s-news

<Prev in Thread] Current Thread [Next in Thread>