s-news

Developing a parsimonious mixed logistic regression model

To: <s-news@lists.biostat.wustl.edu>
Subject: Developing a parsimonious mixed logistic regression model
From: "Hunsicker, Lawrence" <lawrence-hunsicker@uiowa.edu>
Date: Mon, 5 Jan 2009 10:41:43 -0600

Good morning and Happy New Year, folks:

Today I have a statistical, rather than a programming, problem for you folks.  I am trying to develop a mixed logistic model to predict (and perhaps to explain) which patients in a cohort get a certain test done.  The patients are nested within center within country within region.  There are about 25,000 patients and about half have had the test done.  I'd like to test the importance of a small number of covariates (definable a priori), but because this is a voluntary data set with inclusion criteria that vary from center to center, I'd like to correct for a batch of other variables that may contribute noise to the analysis.

First, I understand that with 25,000 patients, it is probably not really necessary to develop a parsimonious model.  I could just throw all the nuisance covariates into the model and then ignore them.  (And I may well choose to do just this.)  But it is traditional to generate a parsimonious model.

So now to the question.  To fit any specific model, I will be using penalized quasi-likelihood, or partial likelihood, or some criterion other than the ordinary likelihood.  This is fine for estimating the parameters within a model, but it doesn't necessarily give me a criterion for choosing between alternative models.  In particular, AIC is not defined for most of these models, and even the log likelihood may not be available.  So neither stepAIC, nor step, nor stepwise works to develop a parsimonious mixed logistic model.  (In fact, if you call one of these functions, you get an error saying that the function won't work on this sort of base model.)
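
To make the difficulty concrete, here is a minimal sketch (in Python, purely for illustration, not S or R) of how AIC is built from a maximized log likelihood.  The data, the fitted probabilities, the parameter counts, and the helper names bernoulli_loglik and aic are all made up for this example; the fitted probabilities are assumed to come from ordinary maximum-likelihood logistic fits.  When the fitting criterion does not yield a true log likelihood, the same calculation has nothing to work with.

```python
import math


def bernoulli_loglik(y, p):
    """Log likelihood of 0/1 outcomes y given fitted probabilities p."""
    return sum(math.log(pi) if yi == 1 else math.log(1.0 - pi)
               for yi, pi in zip(y, p))


def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2 log L."""
    if loglik is None or math.isnan(loglik):
        # No true likelihood (e.g. a quasi-likelihood fit): AIC is undefined.
        return float("nan")
    return 2 * n_params - 2.0 * loglik


# Toy, made-up data: test done (1) or not done (0) for six patients.
y = [1, 0, 1, 1, 0, 0]

# Hypothetical fitted probabilities, assumed to come from two ordinary
# maximum-likelihood logistic fits with 2 and 4 parameters respectively.
p_small = [0.6, 0.4, 0.6, 0.6, 0.4, 0.4]
p_large = [0.7, 0.3, 0.6, 0.7, 0.4, 0.3]

aic_small = aic(bernoulli_loglik(y, p_small), 2)
aic_large = aic(bernoulli_loglik(y, p_large), 4)

# A fit that reports no log likelihood leaves AIC undefined.
aic_no_loglik = aic(float("nan"), 4)

print(aic_small, aic_large, aic_no_loglik)
```

In this toy case the smaller model has the lower AIC, so it would be preferred; the third value is NaN, which is the situation an AIC-driven stepwise procedure cannot handle.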

So how does one choose amongst models optimized using a criterion other than log likelihood?  Is there any mathematically accurate way?  And if there is (and this, I suppose, is the programming question) does an S or R function exist that will automate development of a parsimonious model?

Thanks in advance to any of you who can help me with this question.  Again, Happy New Year to all.

Larry Hunsicker

