s-news

Re: predictions using log-transformed response variables

To: "Schwarz,Paul" <PSchwarz@gcrinsight.com>
Subject: Re: predictions using log-transformed response variables
From: David L Lorenz <lorenz@usgs.gov>
Date: Mon, 11 Sep 2006 08:08:06 -0500
Cc: s-news@lists.biostat.wustl.edu, s-news-owner@lists.biostat.wustl.edu
In-reply-to: <3BA5796D541C8C4EA6D0618F427A313785CD69@CHOCOLATE.GCRInsight.com>

Paul,
  Correcting for transformation bias really depends on what you are trying to do. The model that you specified estimates the mean response in log-space; back-transforming that estimate gives an estimate of the median response, not the mean, in real-world space (assuming the errors are symmetric in log-space). If you need an estimate of the mean response, then there are a couple of options, one of which is Duan's smearing estimator.
  Duan's method is to compute the estimate in transformed space, add each residual in turn to that estimate, back-transform each of those values, and then take their mean as an estimate of the mean response. It works for any transformation and is easy to implement in S-PLUS.
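  A minimal sketch of the smearing steps described above, written in R syntax (it should need little or no change for S-PLUS). The data frame `dat`, its columns, and the prediction points in `newdat` are simulated here purely for illustration and are not from this thread.

```r
# Simulated data, purely illustrative (an assumption, not from the thread)
set.seed(1)
dat <- data.frame(x1 = runif(200), x2 = runif(200))
dat$y <- exp(1 + 2 * dat$x1 - dat$x2 + rnorm(200, sd = 0.5))

# Fit the model on the log scale and keep its residuals
fit <- lm(log(y) ~ x1 + x2, data = dat)
res <- residuals(fit)

# Hypothetical points at which predictions are wanted
newdat <- data.frame(x1 = c(0.2, 0.8), x2 = c(0.5, 0.5))
pred.log <- predict(fit, newdata = newdat)

# Plain back-transformation: an estimate of the median response
pred.median <- exp(pred.log)

# Smearing: add every residual to each prediction, back-transform,
# and average the results to estimate the mean response
pred.mean <- sapply(pred.log, function(p) mean(exp(p + res)))

# For a log transform the same result is a single multiplier,
# the "smear factor", applied to the back-transformed prediction
smear <- mean(exp(res))
```

  For the log transform, `pred.mean` equals `pred.median * smear`, so the smear factor only needs to be computed once per fitted model.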
  If you can assume that the data are log-normally distributed, you can compute a slightly biased back-transformation correction factor based on the properties of the log-normal distribution. For a natural-log transformation, the correction factor is exp(s^2/2), where s^2 is the residual variance (mean squared error) of the regression in log-space. Multiply the back-transformed estimate by that correction factor. I know that there are FORTRAN versions of a minimum variance unbiased estimate (MVUE) of this correction factor, but I do not know if any have been put into S-PLUS or R.
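  A short sketch of the log-normal correction in R syntax, assuming a natural-log transformation and the standard log-normal result that the mean is exp(mu + sigma^2/2). The data are simulated for illustration only, and the residual variance is taken as the residual sum of squares divided by the residual degrees of freedom.

```r
# Simulated data, purely illustrative (an assumption, not from the thread)
set.seed(2)
x <- runif(100)
y <- exp(0.5 + 1.5 * x + rnorm(100, sd = 0.4))

fit <- lm(log(y) ~ x)

# Residual variance of the log-space fit
s2 <- sum(residuals(fit)^2) / fit$df.residual

# Log-normal back-transformation correction factor
cf <- exp(s2 / 2)

# Back-transformed prediction (median) and corrected prediction (mean)
pred.median <- exp(predict(fit, newdata = data.frame(x = 0.5)))
pred.mean <- cf * pred.median
```

  Because s2 is positive, cf is greater than 1, so the corrected estimate of the mean is always larger than the plain back-transformed value.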
Dave


"Schwarz,Paul" <PSchwarz@gcrinsight.com>
Sent by: s-news-owner@lists.biostat.wustl.edu

09/10/2006 02:25 AM

To: <s-news@lists.biostat.wustl.edu>
cc:
Subject: [S] predictions using log-transformed response variables

S-News readers,

I know that this is more of a statistical issue than an S-PLUS issue,
but I was hoping that someone would kindly summarize for me the issues
related to making predictions, e.g., using predict(), involving models
with a log-transformed response variable. For example, if a linear model
is fitted using lm(log(y) ~ x1 + x2, data= ""), what is the proper way
to make predictions using the model? I've heard about a so-called
"smear" factor, but I'm not clear about what it is, or when to apply it,
or how to calculate it. For example, are there standard S-PLUS functions
for calculating a smear factor, or is there an option with the predict
functions? If someone would clarify this issue for me, I would be most
grateful.

Thank you for your time and consideration.

-Paul Schwarz

