On Nov 27, 2006, at 10:49 PM, Frank E Harrell Jr wrote:
Inman, Brant A. M.D. wrote:
S-experts: I am using the Design and Hmisc libraries to fit a logistic
regression model in a manner similar to that given below (note that this
model may not make sense; I use it only to demonstrate my question):
[...]
ddist <- datadist(pbc)
options(datadist='ddist')
fit <- lrm(edema ~ age + alb + ascites + hepmeg, data=pbc)
--------------------------
Since my goal is to extract the odds ratios from this model, I have
tried two different methods: the obvious method of exponentiation of the
regression coefficients and using the summary.Design function.
--------------------------
exp(fit$coef)
summary(fit)
--------------------------
The problem that I have is with the "summary(fit)" output, namely that
its odds ratios are different from those of "exp(fit$coef)". From the
documentation that is provided by the function's author, I surmise that
the reason the odds ratios for the continuous variables are different is
that the function calculates the odds ratios using "inter-quartile
effects". I have not previously encountered this method of computing
odds ratios and would appreciate any advice from the experts regarding
the reasons for using the inter-quartile method for calculating the odds
ratios rather than the usual method. Which method should be reported in
a paper and is there evidence/publications to support this choice?
Please see my book Regression Modeling Strategies which details the
reasons for this and for not doing exp(coef). A one-unit change is
not meaningful for many of the variables we see in biomedical research.
Frank
Brant Inman
Mayo Clinic
--
Frank E Harrell Jr, Professor and Chair
Department of Biostatistics, School of Medicine
Vanderbilt University
Brant,
My additional comments:
For continuous variables that are found to be approximately linear in
the logit (perhaps only due to limited power to detect non-linearity),
I report an odds ratio based on a difference that is easy to understand
or clinically meaningful: e.g., a 10-year difference in age, or a 0.5 cm
difference in lesion size.
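
A minimal sketch of that arithmetic, for illustration only: when a
variable enters linearly in the logit, the odds ratio for a k-unit
difference is exp(k * coefficient). The data are simulated and the names
(age, y, fit) are hypothetical; base R's glm() is used here instead of
lrm() so the example runs without extra packages.

```r
# Simulated data only; not the pbc data discussed in this thread.
set.seed(1)
n   <- 500
age <- rnorm(n, mean = 50, sd = 10)
y   <- rbinom(n, 1, plogis(-3 + 0.04 * age))

# Age enters linearly in the logit.
fit <- glm(y ~ age, family = binomial)
b   <- coef(fit)[["age"]]

or_1yr  <- exp(b)        # odds ratio for a 1-year difference
or_10yr <- exp(10 * b)   # odds ratio for a 10-year difference

# Wald 95% confidence interval for the 10-year odds ratio
se      <- sqrt(vcov(fit)["age", "age"])
ci_10yr <- exp(10 * (b + c(-1, 1) * qnorm(0.975) * se))

round(c(OR_1yr = or_1yr, OR_10yr = or_10yr,
        lower = ci_10yr[1], upper = ci_10yr[2]), 3)
```

The 10-year odds ratio is simply the 1-year odds ratio raised to the
10th power, so the choice of k changes the presentation, not the model.
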
If it's hard to come up with a meaningful difference, if I've
transformed a variable, or if it enters into the model non-linearly,
then I follow the above reference and report an odds ratio based on
Q3 vs. Q1 since a k-unit change in the variable does not have the same
effect across the distribution of values.
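
A rough sketch of a Q3 vs. Q1 odds ratio computed by hand, assuming a
variable that enters the model with a quadratic term. The data are
simulated and the names (alb, y, fit) are hypothetical; this uses base
R's glm() and predict() and is not the code inside summary.Design. The
idea is to take the difference in the linear predictor between the two
quartiles and exponentiate it.

```r
# Simulated data only; not the pbc data discussed in this thread.
set.seed(2)
n   <- 500
alb <- rnorm(n, mean = 3.5, sd = 0.4)
y   <- rbinom(n, 1, plogis(1 - 1.5 * (alb - 3.5) + 1.0 * (alb - 3.5)^2))

# alb enters non-linearly (quadratic in the logit).
fit <- glm(y ~ alb + I(alb^2), family = binomial)

# First and third quartiles of the observed values
q <- quantile(alb, c(0.25, 0.75))

# Linear predictor (log odds) at Q1 and at Q3
lp <- predict(fit, newdata = data.frame(alb = unname(q)), type = "link")

# Q3 vs. Q1 odds ratio: exponentiated difference in log odds
or_q3_q1 <- unname(exp(lp[2] - lp[1]))

# The same quantity written out from the coefficients
b      <- coef(fit)
manual <- exp(b[["alb"]] * (q[[2]] - q[[1]]) +
              b[["I(alb^2)"]] * (q[[2]]^2 - q[[1]]^2))

round(c(Q1 = q[[1]], Q3 = q[[2]], OR_Q3_vs_Q1 = or_q3_q1), 3)
```

With a non-linear term there is no single coefficient to exponentiate,
which is why the comparison has to be made between two specific values.
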
I've had some success explaining the use of Q3 vs. Q1 as "comparing a
person with a typical above-average value to one with a typical
below-average value".
The one drawback to the Q3 vs. Q1 approach is that a careless reviewer
may think you've cut the variable into quartiles, fit the model, and
reported ORs based on this. So a few sentences justifying Q3 vs. Q1 ORs
may be worth including in your report.
Hope this helps,
Stephen Weigand
Rochester, Minnesota, USA