Douglas Bates wrote:
On Dec 20, 2007 1:05 PM, Tim Hesterberg <timh@insightful.com> wrote:
I have run into a situation with an editor: he decidedly wants an
R-squared value for one of my nonlinear regression trials. It was
my understanding that R-squared is not suited to reporting nonlinear
fits; in addition, it is not reported in S-PLUS.
It would be appropriate to report R^2, as a descriptive measure of
goodness of fit of the regression.
You would need to be careful about the definition of R^2. If the
constant model is not contained in the nonlinear model then the
definition of R^2 given below is not appropriate. This does happen
for many nonlinear regression models, including many pharmacokinetic
models, such as the SSfol (self-starting first-order elimination with
logarithms of rate constants) model. These models cannot be reduced
to an arbitrary constant, even in the limit as one or more of the
parameters goes to +/- infinity. In these cases the only sensible
definition of R^2 would be relative to the model in which all the
fitted values are zero (similar to the case of R^2 in a linear
regression model without an intercept term). Because it is not easy
(if, indeed, it is possible at all) to distinguish such cases,
John Chambers and I did not include an R^2 value in the original
version of the nls summary written for S many years ago.
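
To make the two baselines concrete, here is a minimal sketch in plain Python. The data, the fitted values, and the function names are invented for illustration; nothing here comes from an actual nls fit. It computes R^2 once relative to the constant (mean) model and once relative to the model whose fitted values are all zero.

```python
# Hypothetical observations and fitted values, for illustration only.
y = [1.0, 2.0, 3.0, 4.0]
fitted = [1.1, 1.9, 3.2, 3.8]

def sse(y, fitted):
    """Residual sum of squares."""
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

def r2_vs_constant(y, fitted):
    """1 - SSE/SST, with SST taken about the mean (constant-model baseline)."""
    mean_y = sum(y) / len(y)
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - sse(y, fitted) / sst

def r2_vs_zero(y, fitted):
    """1 - SSE/sum(y^2): the baseline is the all-zero fitted model."""
    sst0 = sum(yi ** 2 for yi in y)
    return 1.0 - sse(y, fitted) / sst0

r2_const = r2_vs_constant(y, fitted)
r2_zero = r2_vs_zero(y, fitted)
print(r2_const, r2_zero)
```

The zero-baseline value is never smaller than the mean-baseline value, because the uncorrected sum of squares is at least as large as the corrected one, so the two numbers should not be compared with each other.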
The caution against pseudo-R^2 measures in general (of which this is a
special case) is to me a sound one. The R^2 = 1 - (sse/sst) formula has
the unfortunate property of sometimes leading to negative values of R^2
(if sst is the corrected total sum of squares) for the reason given
above.
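
A small sketch of how a negative value can arise, assuming made-up data that are nearly constant and a model that cannot represent a constant (a least-squares line through the origin stands in here for a nonlinear model of that kind):

```python
# Hypothetical data: y is nearly constant while x increases.
x = [1.0, 2.0, 3.0, 4.0]
y = [5.0, 5.2, 4.9, 5.1]

# Least-squares slope for y = b * x (no intercept, so no constant model).
b = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)
fitted = [b * xi for xi in x]

mean_y = sum(y) / len(y)
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
sst = sum((yi - mean_y) ** 2 for yi in y)  # corrected total sum of squares

r2 = 1.0 - sse / sst
print(r2)  # strongly negative: the mean alone fits far better than b * x
```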
An alternative that at least does not have this property is
R^2 = [corr(y^hat, y)]^2; this does have the attractive intuition of
being directly related to how the fitted values track the observed
values, but it has the disadvantage of being insensitive to the
calibration of the fitted values (if the fitted values were always 1/2
the observed values, this R^2 would be 1, even though predictions from
the model would be terrible).
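
The half-the-observed-value case can be checked directly. The following is only a sketch with invented numbers; the correlation is computed by hand so that nothing beyond the standard library is needed.

```python
import math

# Hypothetical observations; fitted values are exactly half of them.
y = [2.0, 4.0, 6.0, 8.0]
fitted = [yi / 2.0 for yi in y]

mean_y = sum(y) / len(y)
mean_f = sum(fitted) / len(fitted)

# Pearson correlation between fitted and observed values.
sxy = sum((fi - mean_f) * (yi - mean_y) for fi, yi in zip(fitted, y))
sff = sum((fi - mean_f) ** 2 for fi in fitted)
syy = sum((yi - mean_y) ** 2 for yi in y)
r2_corr = (sxy / math.sqrt(sff * syy)) ** 2

# The sum-of-squares version on the same fitted values.
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
r2_ss = 1.0 - sse / syy

print(r2_corr, r2_ss)  # corr-based value is 1; the 1 - sse/sst value is negative
```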
I always tell my students that the SST = SSE + SSR formula for linear
least squares regression is the most amazing formula they'll never
appreciate, and these kinds of difficulties in defining a good R^2
value are just one aspect of that.
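
As a numerical illustration of that decomposition (a sketch on invented data, using closed-form simple linear regression): the identity holds for the least-squares line with an intercept and generally fails once the intercept is dropped.

```python
# Hypothetical data for a simple linear regression.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(y)
mean_x = sum(x) / n
mean_y = sum(y) / n

def decomposition_gap(fitted):
    """Return SST - (SSE + SSR), with SST and SSR taken about the mean of y."""
    sst = sum((yi - mean_y) ** 2 for yi in y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ssr = sum((fi - mean_y) ** 2 for fi in fitted)
    return sst - (sse + ssr)

# Least-squares fit with an intercept.
slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
intercept = mean_y - slope * mean_x
gap_with_intercept = decomposition_gap([intercept + slope * xi for xi in x])

# Least-squares fit through the origin (no intercept).
b0 = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)
gap_no_intercept = decomposition_gap([b0 * xi for xi in x])

print(gap_with_intercept, gap_no_intercept)
```

The first gap is zero up to rounding error; the second is not, which is why 1 - SSE/SST and SSR/SST stop agreeing outside the intercept case.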
Jeff
Be careful about using the R^2 in a hypothesis test for significance
of the regression - the distribution of the R^2 statistic in nonlinear
regression may not be the same as in linear regression.
I have been using r^2 = 1 - (sse/sst); is this a suitable method for NLS?
That gives raw R^2.
It would be better to report an adjusted R^2, adjusted using the estimated
degrees of freedom of the nonlinear regression or smoother.
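
One possible form of the adjustment, sketched on invented numbers. It assumes the usual linear-model formula carries over, with df_model (a hypothetical name) standing for the number of estimated parameters, or the effective degrees of freedom for a smoother; the message above does not spell out the exact formula.

```python
def adjusted_r2(y, fitted, df_model):
    """1 - (SSE / (n - df_model)) / (SST / (n - 1)), with SST about the mean.

    df_model is assumed to count every estimated parameter of the fit.
    """
    n = len(y)
    mean_y = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - (sse / (n - df_model)) / (sst / (n - 1))

# Hypothetical observations and fitted values from a two-parameter model.
y = [1.0, 2.0, 3.0, 4.0]
fitted = [1.1, 1.9, 3.2, 3.8]
adj = adjusted_r2(y, fitted, df_model=2)
print(adj)  # smaller than the raw 1 - sse/sst for the same fit
```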
--------------------------------------------------------------------
This message was distributed by s-news@lists.biostat.wustl.edu. To
unsubscribe send e-mail to s-news-request@lists.biostat.wustl.edu with
the BODY of the message: unsubscribe s-news
--
========================================================================
Jeffrey S. Simonoff Phone: (212) 998-0452
Professor of Statistics
Leonard N. Stern School of Business Fax: (212) 995-4003
New York University
44 West 4th Street, Rm. 8-54 e-mail: jsimonof@stern.nyu.edu
New York, NY 10012-1106
USA WWW: http://www.stern.nyu.edu/~jsimonof
========================================================================