The following is a simple example illustrating a bizarre feature I
noticed.
The problem relates to the multiple r-squared value obtained using lm.
The following creates two vectors, x and y.
> x <- seq(1,10,length=100)
> y <- 2+x*4+rnorm(100,0,4)
> plot(x,y)
Now perform two fits:
> fit.a <- lm(y~x)
> fit.c <- lm(y~rep(1,100)+x-1)
As you can see, fit.a and fit.c are the same fit: fit.c drops the built-in
intercept (the -1) but includes an explicit vector of 1's in its place. As
one would expect, the coefficients from the two models are the same:
> coef(fit.a)
(Intercept)           x
   1.512145    4.173989
> coef(fit.c)
rep(1, 100)           x
   1.512145    4.173989
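To back up the claim that the two fits are the same, here is a small self-contained check that compares the design matrices, fitted values and residuals directly. The set.seed() call and the helper names (X.a, X.c, same.design, same.fitted, same.resid) are my additions for illustration, so the numbers will not match the output quoted in this post.

```r
# Sketch only: set.seed() is an added assumption so the run is reproducible.
set.seed(1)
x <- seq(1, 10, length.out = 100)
y <- 2 + x * 4 + rnorm(100, 0, 4)

fit.a <- lm(y ~ x)
fit.c <- lm(y ~ rep(1, 100) + x - 1)

# Compare the design matrices entry by entry, ignoring column names
# and other attributes.
X.a <- model.matrix(fit.a)
X.c <- model.matrix(fit.c)
same.design <- all(as.vector(X.a) == as.vector(X.c))

# Compare fitted values and residuals up to numerical tolerance.
same.fitted <- isTRUE(all.equal(unname(fitted(fit.a)), unname(fitted(fit.c))))
same.resid <- isTRUE(all.equal(unname(resid(fit.a)), unname(resid(fit.c))))

print(c(design = same.design, fitted = same.fitted, residuals = same.resid))
```

If all three come back TRUE, any difference in the summaries cannot come from the least-squares fit itself.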
The R^2 values, however, are not the same.
> summary(fit.a)$r.squared
[1] 0.8680936
> summary(fit.c)$r.squared
[1] 0.9752623
It seems as if the R^2 value from fit.a is correct:
> cor(fitted(fit.a),y)^2
[1] 0.8680936
> cor(fitted(fit.c),y)^2
[1] 0.8680936
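One way to narrow this down is to compute R^2 by hand in two ways: with the total sum of squares taken about the mean of y, and with it taken about zero. The sketch below assumes that summary() uses the second form when the formula removes the intercept with -1. That is my guess at the cause, not something established above. The set.seed() call and the names rss, tss.centered, tss.raw, r2.centered and r2.raw are introduced here for illustration only.

```r
# Sketch only: set.seed() is an added assumption so the run is reproducible.
set.seed(1)
x <- seq(1, 10, length.out = 100)
y <- 2 + x * 4 + rnorm(100, 0, 4)

fit.a <- lm(y ~ x)
fit.c <- lm(y ~ rep(1, 100) + x - 1)

# Residual sum of squares; the two fits share the same residuals.
rss <- sum(resid(fit.a)^2)

# Two candidate definitions of the total sum of squares.
tss.centered <- sum((y - mean(y))^2)  # about the mean of y
tss.raw <- sum(y^2)                   # about zero

r2.centered <- 1 - rss / tss.centered
r2.raw <- 1 - rss / tss.raw

# Put each hand-computed value next to the corresponding summary() value.
print(c(centered = r2.centered, summary.a = summary(fit.a)$r.squared))
print(c(raw = r2.raw, summary.c = summary(fit.c)$r.squared))
```

If the pairs agree, the larger value reported for fit.c would simply reflect a different baseline (zero rather than the mean of y) and not a better fit.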
Any explanations as to the different behavior?
Thanks.
Jason