
Minimize the Log Likelihood to fit Mixture Models ( ?? )

To: "S-News" <s-news@lists.biostat.wustl.edu>
Subject: Minimize the Log Likelihood to fit Mixture Models ( ?? )
From: "Paul Lasky" <phlasky@earthlink.net>
Date: Thu, 13 Jul 2006 12:15:03 -0700
    I've found a very strange and counter-doctrinal result. I wrote a simple script that uses the standard EM algorithm (see, e.g., The Elements of Statistical Learning, Hastie, Tibshirani, Friedman, sec. 8.5, p. 236) to fit a data density with a mixture of two normal distributions.
 
    My script maximizes the log likelihood using EM, but then loops the whole procedure over new, randomly selected starting conditions. I have found that the best fits are produced when the outer loop (over starting conditions) selects the minimal log likelihood value produced by the inner EM routine, i.e. Min{ over start conds } Max{ over mu, sig, and p vectors } log likelihood( mu, sig, p | start conds ).
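
    The two nested loops described above can be sketched as follows. This is a minimal illustration in Python rather than S, and it is not the script referred to in this post. The function names, the sig_floor guard on the standard deviations, the fixed iteration count, and the choice of starting values (two randomly drawn data points as initial means, the overall sample standard deviation for both components, p = 0.5) are all assumptions made for the sketch.

```python
import math
import random


def normal_pdf(x, mu, sig):
    """Density of N(mu, sig^2) at x."""
    z = (x - mu) / sig
    return math.exp(-0.5 * z * z) / (sig * math.sqrt(2.0 * math.pi))


def log_lik(data, mu, sig, p):
    """Log likelihood of a two-normal mixture; p is the weight of component 2."""
    total = 0.0
    for x in data:
        dens = (1.0 - p) * normal_pdf(x, mu[0], sig[0]) + p * normal_pdf(x, mu[1], sig[1])
        total += math.log(max(dens, 1e-300))  # guard against log(0) on underflow
    return total


def em_two_normals(data, mu, sig, p, n_iter=200, sig_floor=1e-3):
    """Inner loop: EM from one starting condition. Returns (mu, sig, p, loglik)."""
    mu, sig = list(mu), list(sig)
    n = len(data)
    for _ in range(n_iter):
        # E-step: responsibility of component 2 for each data point.
        resp = []
        for x in data:
            a = (1.0 - p) * normal_pdf(x, mu[0], sig[0])
            b = p * normal_pdf(x, mu[1], sig[1])
            resp.append(b / (a + b) if a + b > 0.0 else 0.5)
        w1 = sum(resp)
        w0 = n - w1
        if w0 < 1e-8 or w1 < 1e-8:
            break  # one component has emptied out; stop updating
        # M-step: weighted means, standard deviations, and mixing weight.
        mu[0] = sum((1.0 - r) * x for r, x in zip(resp, data)) / w0
        mu[1] = sum(r * x for r, x in zip(resp, data)) / w1
        var0 = sum((1.0 - r) * (x - mu[0]) ** 2 for r, x in zip(resp, data)) / w0
        var1 = sum(r * (x - mu[1]) ** 2 for r, x in zip(resp, data)) / w1
        sig[0] = max(math.sqrt(var0), sig_floor)  # assumed guard against collapse
        sig[1] = max(math.sqrt(var1), sig_floor)
        p = w1 / n
    return mu, sig, p, log_lik(data, mu, sig, p)


def fit_with_restarts(data, n_starts=20, seed=0):
    """Outer loop: rerun EM from randomly chosen starting conditions."""
    rng = random.Random(seed)
    n = len(data)
    mean = sum(data) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in data) / n)
    runs = []
    for _ in range(n_starts):
        mu0 = rng.sample(data, 2)  # two data points as initial means
        runs.append(em_two_normals(data, mu0, [sd, sd], 0.5))
    return runs
```

    Given runs = fit_with_restarts(data), the usual Max Max choice is max(runs, key=lambda r: r[3]) and the Min Max choice is min(runs, key=lambda r: r[3]), so the two selection rules can be compared on the same set of starts.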
   
    When the outer loop is re-programmed to select the usual maximum value of the log likelihood over all the alternative starting conditions, i.e. Max Max( log lik ), it produces ambiguous results where the two normal distributions are nearly identical, i.e. they have nearly equal means and standard deviations.
 
    Can anyone explain this seemingly heretical result? Can it be that the EM algorithm is trying to find the spurious global maximum of the likelihood, i.e. a single spike of density at one data point?
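
    One way to check the spike conjecture is to evaluate the mixture log likelihood directly while one component is centered on a single data point and its standard deviation is shrunk toward zero. The sketch below (Python, with hypothetical function names; the broad component is simply assumed to take the overall sample mean and standard deviation) shows the quantity involved. If the spike is the culprit, these values should keep growing as the standard deviation shrinks, since the spike's density at its own data point scales as 1/sig.

```python
import math


def normal_pdf(x, mu, sig):
    """Density of N(mu, sig^2) at x."""
    z = (x - mu) / sig
    return math.exp(-0.5 * z * z) / (sig * math.sqrt(2.0 * math.pi))


def mixture_log_lik(data, mu, sig, p):
    """Log likelihood of a two-normal mixture; p is the weight of component 2."""
    return sum(
        math.log((1.0 - p) * normal_pdf(x, mu[0], sig[0]) + p * normal_pdf(x, mu[1], sig[1]))
        for x in data
    )


def spike_log_liks(data, sigmas, p=0.5):
    """Log likelihood with component 2 centered on data[0] for each spike width."""
    n = len(data)
    mean = sum(data) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in data) / n)
    spike_at = data[0]  # assumed: any single data point would do
    # Component 1 stays broad so every point keeps a positive density.
    return [mixture_log_lik(data, (mean, spike_at), (sd, s), p) for s in sigmas]
```

    Calling spike_log_liks(data, [1e-1, 1e-2, 1e-4, 1e-8]) on the data being fitted gives the log likelihood along this path, which can be compared with the values the EM runs report.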
 
  Paul H. Lasky
  P & B Consultants