My previous message is quoted below; I thought I should clarify it
somewhat.
The variables mod500 through mod200 are dummy (indicator) variables, one
for each of the five models. If I included all five of them in the model
together with an intercept, the design matrix would not be of full rank.
So I can either leave one of them out and include an intercept, or
include all five and leave out the intercept.
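As a small illustration of the rank problem, here is a minimal sketch on made-up data (the vector grp and the matrix names are hypothetical and are not taken from merc.design). Because each row has exactly one dummy equal to 1, the five dummy columns sum to the intercept column:

```r
# Toy illustration (hypothetical data, not merc.design):
# five mutually exclusive dummy columns plus an intercept column.
grp <- rep(1:5, times = c(3, 2, 4, 3, 2))   # made-up group labels
dummies <- sapply(1:5, function(k) as.numeric(grp == k))
colnames(dummies) <- paste("d", 1:5, sep = "")

ones <- rep(1, length(grp))

# Each row has exactly one 1 among the dummies, so they add up to the
# intercept column.
sums.to.one <- all(rowSums(dummies) == ones)

X.all   <- cbind(ones, dummies)          # intercept + all five dummies
X.drop  <- cbind(ones, dummies[, -5])    # intercept + four dummies
X.noint <- dummies                       # five dummies, no intercept

rank.all   <- qr(X.all)$rank     # 6 columns, but only rank 5
rank.drop  <- qr(X.drop)$rank    # 5 columns, rank 5
rank.noint <- qr(X.noint)$rank   # 5 columns, rank 5
```

Either of the two full-rank matrices spans the same column space, which is why the two parameterizations should describe the same fit.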
I fit the two models manually using this code:
Model1: with intercept
x.mat2<-cbind(rep(1,54),merc.design[,2:7],merc.design[,2]*merc.design[,3])
x.mat2<-as.matrix(x.mat2)
y<-log(merc.design[,1]); y<-as.vector(y)
B.hat2<-solve((t(x.mat2)%*%x.mat2))%*%t(x.mat2)%*%y
rss2<-t(y-x.mat2%*%B.hat2)%*%(y-x.mat2%*%B.hat2)
tss2<-t(y-mean(y))%*%(y-mean(y));
r2.2<-1-(rss2/tss2)
and get these results:
> B.hat2
                 [,1]
X1       8.7761388851   # intercept
date     0.0528854475
mileage -0.0093527444
mod500   0.7884192633
mod450   0.6990721297
mod380   0.6765803326
mod280   0.4149164225
X3       0.0007829935   # date:mileage interaction
and r2.2=[1,] 0.9394712
compared to the call to lm:
               Value Std. Error t value Pr(>|t|)
(Intercept)   8.7761     0.1777 49.3779   0.0000
date          0.0529     0.0182  2.9031   0.0057
mileage      -0.0094     0.0042 -2.2111   0.0320
mod500        0.7884     0.0437 18.0278   0.0000
mod450        0.6991     0.0616 11.3442   0.0000
mod380        0.6766     0.0402 16.8173   0.0000
mod280        0.4149     0.0375 11.0750   0.0000
date:mileage  0.0008     0.0005  1.6108   0.1141
Residual standard error: 0.09795 on 46 degrees of freedom
Multiple R-Squared: 0.9395
So everything agrees for Model1. The next model is where the
discrepancy appears.
Model2: without intercept but with 5th dummy variable
x.mat<-cbind(merc.design[,2:8],merc.design[,2]*merc.design[,3])
x.mat<-as.matrix(x.mat)
y<-log(merc.design[,1]); y<-as.vector(y)
B.hat<-solve((t(x.mat)%*%x.mat))%*%t(x.mat)%*%y
rss<-t(y-x.mat%*%B.hat)%*%(y-x.mat%*%B.hat)
tss<-t(y-mean(y))%*%(y-mean(y));
r2<-1-(rss/tss)
> B.hat
                 [,1]
date     0.0528854475
mileage -0.0093527444
mod500   9.5645581485
mod450   9.4752110149
mod380   9.4527192177
mod280   9.1910553076
mod200   8.7761388852
X2       0.0007829935   # date:mileage interaction
This gives r2=0.9394712, the same as for Model1. Compare these parameter
estimates and this r2 with the output of lm for model2:
summary(model2)
Coefficients:
               Value Std. Error t value Pr(>|t|)
date          0.0529     0.0182  2.9031   0.0057
mileage      -0.0094     0.0042 -2.2111   0.0320
mod500        9.5646     0.1756 54.4779   0.0000
mod450        9.4752     0.1545 61.3233   0.0000
mod380        9.4527     0.1770 53.3995   0.0000
mod280        9.1911     0.1705 53.9107   0.0000
mod200        8.7761     0.1777 49.3779   0.0000
date:mileage  0.0008     0.0005  1.6108   0.1141
Residual standard error: 0.09795 on 46 degrees of freedom
Multiple R-Squared: 0.9999
So the parameter estimates from lm are the same as those from the manual
fit, but the R-Squared that lm reports for model2 (0.9999) is much higher
than the one from the manual fit (0.9395).
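One possible explanation, stated here as an assumption rather than something verified on S-PLUS 3.4: when the formula has no intercept, summary() may measure the total sum of squares about zero, sum(y^2), instead of about mean(y) as in the manual calculation. R's summary.lm is documented to behave this way. The sketch below uses simulated data (all names and numbers are made up, not merc.design) and R-compatible code to compare the two definitions against what summary() reports for each parameterization:

```r
# Simulated data (hypothetical values; not merc.design).
set.seed(1)
n   <- 30
grp <- rep(c("a", "b", "c"), each = 10)
x   <- runif(n, 0, 10)
da  <- as.numeric(grp == "a")
db  <- as.numeric(grp == "b")
dc  <- as.numeric(grp == "c")
y   <- 9 + 0.05 * x + 0.8 * da + 0.4 * db + rnorm(n, sd = 0.1)

fit1 <- lm(y ~ x + da + db)             # intercept, dummy dc left out
fit2 <- lm(y ~ -1 + x + da + db + dc)   # no intercept, all three dummies

# Both design matrices span the same column space, so the fits coincide.
max.fit.diff <- max(abs(fitted(fit1) - fitted(fit2)))
rss <- sum(resid(fit2)^2)

# Dummy coefficients in fit2 equal the fit1 intercept plus the fit1 offset.
shift.da <- coef(fit2)["da"] - (coef(fit1)["(Intercept)"] + coef(fit1)["da"])

# Two candidate definitions of the total sum of squares.
r2.centered   <- 1 - rss / sum((y - mean(y))^2)   # about the mean
r2.uncentered <- 1 - rss / sum(y^2)               # about zero

r2.lm1 <- summary(fit1)$r.squared   # agrees with r2.centered
r2.lm2 <- summary(fit2)$r.squared   # agrees with r2.uncentered in R
print(c(r2.centered, r2.uncentered, r2.lm1, r2.lm2))
```

If the same convention applies here, the numbers are at least roughly consistent: the residual sum of squares is about 0.09795^2 * 46 = 0.44, while log(price) is around 9.5, so sum(y^2) over 54 observations is in the thousands and 1 - rss/sum(y^2) comes out close to 0.9999.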
Brian
On Thu, 29 Apr 1999, Brian Beckage wrote:
>
> Hello,
>
> I noticed that when I fit the following two models I get
> different R^2's when (I think) they should be the same.
>
> model1<-lm(log(price)~date+mileage+date:mileage+mod450+mod380+
> mod280+mod200,data=merc.design)
>
> model2<-lm(log(price)~-1+date+mileage+date:mileage+mod500+mod450+mod380+
> mod280+mod200,data=merc.design)
>
> summary(model1) yields an R^2 of 0.9395 while summary(model2) yields an
> R^2 of 0.9999. I'm using Splus3.4.
>
> Thanks,
> Brian
>
> The data are:
>
> > merc.design[,1:8]
> price date mileage mod500 mod450 mod380 mod280 mod200
> 1 30495 11 7 1 0 0 0 0
> 2 22250 9 16 1 0 0 0 0
> 3 23995 9 8 1 0 0 0 0
> 4 18495 9 15 1 0 0 0 0
> 5 20950 10 26 1 0 0 0 0
> 6 21500 9 18 1 0 0 0 0
> 7 19995 7 24 1 0 0 0 0
> 8 18950 7 20 1 0 0 0 0
> 9 15695 6 13 0 1 0 0 0
> 10 15995 6 27 0 1 0 0 0
> 11 16595 6 18 0 1 0 0 0
> 12 15995 6 25 0 1 0 0 0
> 13 9950 1 43 0 1 0 0 0
> 14 17995 9 23 0 0 1 0 0
> 15 17495 8 16 0 0 1 0 0
> 16 21995 9 13 0 0 1 0 0
> 17 21995 11 2 0 0 1 0 0
> 18 19695 10 5 0 0 1 0 0
> 19 16850 8 15 0 0 1 0 0
> 20 16750 8 44 0 0 1 0 0
> 21 19850 8 22 0 0 1 0 0
> 22 21000 9 5 0 0 1 0 0
> 23 17950 7 44 0 0 1 0 0
> 24 17995 11 6 0 0 0 1 0
> 25 16295 10 14 0 0 0 1 0
> 26 16495 9 4 0 0 0 1 0
> 27 15995 8 15 0 0 0 1 0
> 28 13995 9 7 0 0 0 1 0
> 29 16995 11 11 0 0 0 1 0
> 30 15595 9 14 0 0 0 1 0
> 31 11995 7 11 0 0 0 1 0
> 32 13195 8 13 0 0 0 1 0
> 33 8995 5 31 0 0 0 1 0
> 34 17500 8 25 0 0 0 1 0
> 35 15450 8 25 0 0 0 1 0
> 36 14750 10 21 0 0 0 1 0
> 37 15750 8 16 0 0 0 1 0
> 38 12250 7 27 0 0 0 1 0
> 39 10995 10 13 0 0 0 0 1
> 40 10995 10 12 0 0 0 0 1
> 41 10495 9 24 0 0 0 0 1
> 42 8995 7 34 0 0 0 0 1
> 43 10295 9 7 0 0 0 0 1
> 44 6450 5 41 0 0 0 0 1
> 45 10950 8 29 0 0 0 0 1
> 46 9750 8 21 0 0 0 0 1
> 47 6950 4 35 0 0 0 0 1
> 48 9950 9 11 0 0 0 0 1
> 49 8750 9 40 0 0 0 0 1
> 50 10750 10 12 0 0 0 0 1
> 51 4950 4 57 0 0 0 0 1
> 52 9250 10 23 0 0 0 0 1
> 53 9250 8 21 0 0 0 0 1
> 54 8995 9 24 0 0 0 0 1
> -----------------------------------------------------------------------
> This message was distributed by s-news@wubios.wustl.edu. To unsubscribe
> send e-mail to s-news-request@wubios.wustl.edu with the BODY of the
> message: unsubscribe s-news
>
>