Thanks, Stephen.
Yes, indeed! That's exactly what I did. Shame on me. I didn't make that
connection looking at the examples and missed the warnings about it.
Thanks!
Kim Elmore
----- Original Message -----
From: Stephen Kaluzny <skaluzny@tibco.com>
Date: Tuesday, March 17, 2009 1:12 am
Subject: Re: [S] predict.lm query
> On Mon, Mar 16, 2009 at 09:39:07PM -0500, Kim.Elmore@noaa.gov wrote:
> > This is so simple, I'm embarrassed. Nevertheless...
> >
> > I used a data frame with 841 rows (called rdrdata.train) to generate an
> > lm object. I then tried to use predict.lm on another data frame
> > containing new data with which I wanted to test the performance of my lm
> > object. My new data has 361 rows (same predictors).
> >
> > I get an error message that says
> >
> > Problem in model.matrix.default(delete.response(Terms), ..: Length of
> > rdrdata.train[, 2] (variable 1) is 841 != length of others (361)
> > Use traceback() to see the call stack
>
> I suspect that you wrote the formula for the lm model using the data
> frame and subscript operators e.g.
>
> lmObj <- lm(rdrdata.train[, 1] ~ rdrdata.train[, 2] +
> rdrdata.train[, 3])
>
> perhaps with data=rdrdata.train (but that is not needed for a call
> like that). Your lmObj formula is now written in terms of rdrdata.train
> itself; look at formula(lmObj) to see this.
>
> You then tried to do predictions with lmObj on another data frame,
> rdrdata.score, perhaps:
>
> predict(lmObj, newdata=rdrdata.score)
>
> You need to write the formula for lm with the names of the columns of
> the data frame, not the data frame itself. Your rdrdata.score data
> frame should then contain columns with the same names. E.g.
>
> If names(rdrdata.train) gives:
>
> "x1" "x2" "y"
>
> then fit the lm with:
>
> lmObj <- lm(y ~ x1 + x2, data=rdrdata.train)
>
> and predict with:
>
> predict(lmObj, newdata=rdrdata.score)
>
> where names(rdrdata.score) is also:
>
> "x1" "x2" "y"
>
> -Stephen
>
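For anyone who finds this thread later, here is a minimal sketch of the two ways of writing the formula, side by side. It is written in R syntax and uses simulated data; the helper make_data and the simulated values are hypothetical, and only the column names x1, x2, y and the row counts 841 and 361 come from the messages above. One difference to be aware of: R does not stop with the error quoted above when predicting from the subscript-style fit. It warns that newdata has a different number of rows and returns predictions for the training rows instead.

```r
# Hypothetical data: the column names x1, x2, y follow the example above;
# the values are simulated purely for illustration.
set.seed(1)
make_data <- function(n) {
  x1 <- rnorm(n)
  x2 <- rnorm(n)
  data.frame(x1 = x1, x2 = x2, y = 1 + 2 * x1 - x2 + rnorm(n, sd = 0.1))
}
rdrdata.train <- make_data(841)
rdrdata.score <- make_data(361)

# Formula written with column names: predict() looks up x1 and x2 in newdata.
lmObj <- lm(y ~ x1 + x2, data = rdrdata.train)
pred <- predict(lmObj, newdata = rdrdata.score)
length(pred)   # one prediction per row of rdrdata.score

# Formula written with subscripts: every term refers to rdrdata.train itself,
# so newdata cannot supply replacement values for the predictors.
badObj <- lm(rdrdata.train[, 3] ~ rdrdata.train[, 1] + rdrdata.train[, 2])
formula(badObj)
```

Comparing formula(lmObj) with formula(badObj) is the quickest way to check which form a fitted object uses before calling predict on new data.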