Hi,
I see some strange behavior when generating the fitted values from an lm
object. Consider:
> version
Version 6.0 Release 1 for Sun SPARC, SunOS 5.6 : 2000
> set.seed(9)
> tempDF <- data.frame( y = rnorm(6), f = as.factor(paste("level", rep(1:2,3),
+ sep = "")))
> tempDF
           y      f
1 -0.2009222 level1
2  0.4997202 level2
3  0.3564666 level1
4  0.4703904 level2
5 -0.4061262 level1
6 -1.6078728 level2
> fitted(lm(tempDF$y ~ tempDF$f))
           1          2           3          4           5          6
 -0.08352725 -0.2125874 -0.08352725 -0.2125874 -0.08352725 -0.2125874
At this point, everything looks OK. The fitted values are the means of the
response within each level of the factor:
> tapply(tempDF$y, tempDF$f, mean)
      level1     level2
 -0.08352725 -0.2125874
Unfortunately, the results are not *identical* within each level. Consider:
> unique(fitted(lm(tempDF$y ~ tempDF$f)))
[1] -0.08352725 -0.21258740 -0.08352725 -0.21258740 -0.21258740
> fitted(lm(tempDF$y ~ tempDF$f))[1] == fitted(lm(tempDF$y ~ tempDF$f))[3]
1
F
> options(digits = 15)
> fitted(lm(tempDF$y ~ tempDF$f))[1:3]
                   1                  2                   3
 -0.0835272529974701 -0.212587401068558 -0.0835272529974702
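For what it's worth, here is a minimal sketch of a workaround, assuming the discrepancy is ordinary floating-point rounding in the last digit (I have not confirmed that this is the cause). It rebuilds the data from the rounded values printed above rather than from set.seed(9), compares the fitted values with a tolerance instead of ==, and rounds before calling unique(). The names fit, tol, nearly.equal and distinct, and the tolerance 1e-8, are my own arbitrary choices.

```r
# Data copied from the printed tempDF above (rounded to 7 decimals),
# so this does not depend on the random number generator.
tempDF <- data.frame(
  y = c(-0.2009222, 0.4997202, 0.3564666, 0.4703904, -0.4061262, -1.6078728),
  f = factor(paste("level", rep(1:2, 3), sep = ""))
)

fit <- fitted(lm(y ~ f, data = tempDF))

# Tolerance-based comparison: treat two fitted values as equal when they
# differ by less than tol (1e-8 is an arbitrary choice, not a recommendation).
tol <- 1e-8
nearly.equal <- abs(fit[1] - fit[3]) < tol

# Collapse near-duplicates before unique() by rounding to 10 significant
# digits, well below the roughly 15-16 digits a double can carry.
distinct <- unique(signif(fit, 10))
```

With this, nearly.equal should be true even when fit[1] == fit[3] is not, and distinct should contain one value per factor level.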
Any insights on this problem would be much appreciated.
Thanks,
Dave Kane
--
David Kane
Geode Capital Management
617-563-0122
david.d.kane@fmr.com