Re: predict.lm query

To: Stephen Kaluzny <skaluzny@tibco.com>
Subject: Re: predict.lm query
From: Kim.Elmore@noaa.gov
Date: Tue, 17 Mar 2009 07:52:17 -0500
Cc: snews <s-news@wubios.wustl.edu>
Thanks, Stephen.

Yes, indeed!  That's exactly what I did. Shame on me. I didn't make that
connection looking at the examples and missed the warnings about it.

Thanks!

Kim Elmore

----- Original Message -----
From: Stephen Kaluzny <skaluzny@tibco.com>
Date: Tuesday, March 17, 2009 1:12 am
Subject: Re: [S] predict.lm query
> On Mon, Mar 16, 2009 at 09:39:07PM -0500, Kim.Elmore@noaa.gov wrote:
> > This is so simple, I'm embarrassed. Nevertheless...
> > 
> > I used a data frame with 841 rows (called rdrdata.train) to generate an
> > lm object. I then tried to use predict.lm on another data frame
> > containing new data with which I wanted to test the performance of my lm
> > object. My new data has 361 rows (same predictors).
> > 
> > I get an error message that says 
> > 
> > Problem in model.matrix.default(delete.response(Terms), ..: Length of
> > rdrdata.train[, 2] (variable 1) is 841 != length of others (361)
> > Use traceback() to see the call stack
> 
> I suspect that you wrote the formula for the lm model using the data
> frame and subscript operators e.g.
> 
>    lmObj <- lm(rdrdata.train[, 1] ~ rdrdata.train[, 2] + rdrdata.train[, 3])
> 
> perhaps with data=rdrdata.train (though that argument is not needed for a
> call like that).  Your lmObj formula is now in terms of rdrdata.train
> itself; look at formula(lmObj).
> 
> You then tried to do predictions with lmObj on another data frame,
> rdrdata.score, perhaps:
> 
>    predict(lmObj, newdata=rdrdata.score)
> 
> You need to write the formula for lm with the names of the columns of
> the data frame, not the data frame itself. Your rdrdata.score data
> frame should then contain columns with the same names. E.g.
> 
> If names(rdrdata.train) gives:
> 
>   "x1" "x2" "y"
> 
> then fit the lm with:
> 
>   lmObj <- lm(y ~ x1 + x2, data=rdrdata.train)
> 
> and predict with:
> 
>    predict(lmObj, newdata=rdrdata.score)
> 
> where names(rdrdata.score) is also:
> 
>    "x1" "x2" "y"
> 
> -Stephen
> 
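
For anyone finding this thread later, the advice quoted above can be tried on toy data. This is only a sketch: the data frames `train` and `score` and their values are made up for illustration (they stand in for rdrdata.train and rdrdata.score), and it is written in R syntax on the assumption that the same calls behave similarly in S-PLUS.

```r
# Hypothetical toy data standing in for rdrdata.train and rdrdata.score
set.seed(1)
train <- data.frame(x1 = rnorm(20), x2 = rnorm(20))
train$y <- 1 + 2 * train$x1 - train$x2 + rnorm(20, sd = 0.1)
score <- data.frame(x1 = rnorm(5), x2 = rnorm(5))

# Formula written with subscripts: its terms refer to `train` itself,
# so they cannot be matched against the columns of another data frame.
badObj <- lm(train[, 3] ~ train[, 1] + train[, 2])
print(formula(badObj))

# Formula written with column names: the terms are looked up by name
# in whatever data frame is passed as `newdata`.
goodObj <- lm(y ~ x1 + x2, data = train)
pred <- predict(goodObj, newdata = score)
print(length(pred))  # one prediction per row of `score`
```

Printing formula(badObj) shows the subscript expressions, which is the quickest way to spot the problem in an existing fit.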
