On Thu, 29 Jan 2004 16:57:31 -0200
"fmedeiros" <fmedeiros@uol.com.br> wrote:
> Dear All, I was asked to criticize the following strategy for logistic
> regression model building and I would appreciate any comments from the
> list (positive or negative):
>
> 1. After exclusion of variables based on subject knowledge, 25 variables
> were considered as possible candidates (sample size ~200, with smaller
> group ~90)
Reasonable.
> 2. An all-subsets regression technique was employed and the model chosen
> was the one with the largest bias-corrected (bootstrap, B=200) ROC curve
> area
Not very reasonable. Better to do data reduction (ignoring Y) and fit a
full model on the reduced set.
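
To put rough numbers on step 2, here is a small arithmetic sketch. The
10-to-20 events-per-parameter range is an assumed rule of thumb, not
something stated in this message; the counts 25 and 90 come from the
quoted post.

```python
# Counts taken from the quoted post.
n_candidates = 25      # candidate variables after subject-matter screening
smaller_group = 90     # size of the smaller outcome group

# All-subsets search: each variable is either in or out of the model
# (this count includes the empty model).
n_subsets = 2 ** n_candidates

# Events available per candidate variable if all 25 were fitted at once.
events_per_candidate = smaller_group / n_candidates

# ASSUMPTION: a rule of thumb of roughly 10 to 20 events per parameter.
supportable = (smaller_group // 20, smaller_group // 10)

print(n_subsets)             # 33554432
print(events_per_candidate)  # 3.6
print(supportable)           # (4, 9)
```

Under that assumed rule, the data would support somewhere around 4 to 9
parameters, which is one way to read the suggestion to reduce the
candidate set without looking at Y before fitting.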
> 3. _To check stability of the model_, another bootstrap procedure was
> performed (B=200 again) and step 2 was included in every bootstrap
> sample (? Double bootstrap). So, each bootstrap resample had its _best
> model_ with its associated optimism (obtained from step 2).
Good idea to do a double bootstrap for this type of strategy.
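
A minimal sketch of the outer loop described in step 3, under loud
simplifying assumptions: the predictors are simulated pure noise, and
the hypothetical `select` function (pick the single column with the
largest apparent ROC area) stands in for the all-subsets search with
its inner bootstrap, which is omitted here for brevity. The point is
only that selection is repeated inside every resample, so the averaged
optimism reflects the selection step too.

```python
import random

def auc(scores, labels):
    """ROC area via the Mann-Whitney rank sum, using midranks for ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # average 1-based rank of the tied block
        i = j + 1
    n1 = sum(labels)
    n0 = len(labels) - n1
    rank_sum = sum(r for r, lab in zip(ranks, labels) if lab == 1)
    return (rank_sum - n1 * (n1 + 1) / 2) / (n1 * n0)

def select(X, y):
    """Hypothetical stand-in for model selection: best single column and its sign."""
    best = (0.0, 0, 1)
    for j in range(len(X[0])):
        a = auc([row[j] for row in X], y)
        cand = (max(a, 1 - a), j, 1 if a >= 0.5 else -1)
        if cand[0] > best[0]:
            best = cand
    return best

rng = random.Random(2004)
n, p, B = 200, 25, 200
y = [1] * 90 + [0] * 110  # smaller group of 90, as in the post
# ASSUMPTION: simulated pure-noise predictors, so the true ROC area is 0.5.
X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]

apparent, _, _ = select(X, y)
optimisms = []
for _ in range(B):
    idx = [rng.randrange(n) for _ in range(n)]
    Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
    a_boot, j, sign = select(Xb, yb)               # selection redone in the resample
    a_orig = auc([sign * row[j] for row in X], y)  # that choice scored on the original data
    optimisms.append(a_boot - a_orig)

mean_optimism = sum(optimisms) / B
corrected = apparent - mean_optimism
print(f"apparent={apparent:.3f} optimism={mean_optimism:.3f} corrected={corrected:.3f}")
```

With noise-only predictors the apparent area of the selected variable
sits above 0.5 purely because of selection, and subtracting the
averaged optimism pulls the estimate back down.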
> 4. The number of times that each variable appeared in the 200 bootstrap
> samples (200 _best models_, but not necessarily all different) from step
> 3 was reported.
Why? This has severe problems with collinearity and tells you nothing
different than P-values.
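
One way to see the collinearity problem is a small simulation. This is a
sketch under assumed data, not the analysis from the post: columns 0 and
1 are two equally noisy copies of one latent signal, column 2 is pure
noise, and the hypothetical `best_single` function (pick the column with
the largest apparent ROC area) stands in for the selection step.

```python
import math
import random
from collections import Counter

def auc(scores, labels):
    """ROC area via the Mann-Whitney rank sum, using midranks for ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # average 1-based rank of the tied block
        i = j + 1
    n1 = sum(labels)
    n0 = len(labels) - n1
    rank_sum = sum(r for r, lab in zip(ranks, labels) if lab == 1)
    return (rank_sum - n1 * (n1 + 1) / 2) / (n1 * n0)

def best_single(X, y):
    """Hypothetical selection step: index of the column with the largest ROC area."""
    areas = []
    for j in range(len(X[0])):
        a = auc([row[j] for row in X], y)
        areas.append(max(a, 1 - a))  # direction of the association is ignored
    return areas.index(max(areas))

rng = random.Random(2004)
n, B = 200, 200
X, y = [], []
for _ in range(n):
    z = rng.gauss(0, 1)                  # latent signal driving the outcome
    X.append([z + rng.gauss(0, 0.3),     # column 0: noisy copy of z
              z + rng.gauss(0, 0.3),     # column 1: second noisy copy, collinear with column 0
              rng.gauss(0, 1)])          # column 2: unrelated noise
    y.append(1 if rng.random() < 1 / (1 + math.exp(-1.5 * z)) else 0)

counts = Counter()
for _ in range(B):
    idx = [rng.randrange(n) for _ in range(n)]
    counts[best_single([X[i] for i in idx], [y[i] for i in idx])] += 1

print({j: counts[j] for j in range(3)})  # selection frequency per column
```

Columns 0 and 1 carry the same information, so the selections tend to be
divided between them, and a frequency table can make either one look less
important than the shared signal really is.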
> 5. The final model reported was the one obtained after all-subsets
> regression was applied to the original sample, and although the
> variables in this model were the ones that showed the highest frequency
> in the 200 bootstrap samples, it was recognized that several other "best
> models" were possible, which was illustrated by the frequency of
> variables in the 200 bootstrap resamples. The _optimism_ was reported as
> the average of the 200 _optimisms_.
Good idea to look at "close misses" among the models, but I would use a
completely different and simpler strategy. See my book Regression
Modeling Strategies for more info. -Frank Harrell
>
> Fernando Medeiros.
>
>
> Fernando Medeiros, M.D., Ph.D.
> Department of Neurosciences
> University of Sao Paulo
>
---
Frank E Harrell Jr
Professor and Chair, Department of Biostatistics
School of Medicine, Vanderbilt University