
Modeling very rare events

To: <s-news@lists.biostat.wustl.edu>
Subject: Modeling very rare events
From: stephen.veronneau@faa.gov
Date: Fri, 22 Apr 2005 14:59:58 -0500

At our institute we keep information on 3 million pilots (12 million exam records) that are matched with accident outcome datasets and pilot registry datasets. We are interested in modeling the very rare outcome of an aviation accident or incident. In a Windows software environment we use IMiner3 and SPlus6.2 (soon to be SPlus7) to visualize and analyze the data.

How rare is rare?
1.11 accidents per 100,000 hours flown. Only 1.1% of pilots had an accident in an 11-year period: 15,652 of 1.4 million airmen.
In our study of subgroups of pilots with one particular medical condition, for instance, we have a set of 5,243 pilots out of the 1.4 million (0.37%) in an 11-year time frame.
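As a quick arithmetic check on the percentages quoted above, here is a minimal Python sketch. The variable names are illustrative only, and "1.4 million" is taken as exactly 1,400,000, which is an assumption since the post gives a rounded figure.

```python
# Counts quoted in the post; the 1.4 million total is a rounded figure,
# so the shares below are approximate.
total_airmen = 1_400_000        # assumption: "1.4 million" taken literally
pilots_with_accident = 15_652   # pilots with an accident in the 11-year period
subgroup_pilots = 5_243         # pilots with the particular medical condition

accident_share = pilots_with_accident / total_airmen
subgroup_share = subgroup_pilots / total_airmen

print(f"Share of pilots with an accident: {accident_share:.2%}")
print(f"Share of pilots in the subgroup:  {subgroup_share:.2%}")
```

Both shares agree with the rounded 1.1% and 0.37% stated in the text.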

Of pilots who have sustained an accident, even fewer have gone on to further accidents. From 1993 to 2003:

Number of Mishaps    Count of Pilots
1                    15,652
2                       514
3                        26
4                         2
5                         1
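To show how quickly the counts thin out, the following Python sketch computes the ratio of each row of the table to the row before it. It is purely descriptive. The post does not say whether each row counts pilots with exactly that many mishaps or at least that many, so the ratios are not claimed to be conditional probabilities.

```python
# Mishap counts from the table above (1993-2003).
mishap_counts = {1: 15_652, 2: 514, 3: 26, 4: 2, 5: 1}

# Ratio of the count at k+1 mishaps to the count at k mishaps.
# Descriptive only: the table does not state whether rows are
# "exactly k" or "at least k" mishaps.
ratios = {
    k + 1: mishap_counts[k + 1] / mishap_counts[k]
    for k in range(1, 5)
}

for k, ratio in ratios.items():
    print(f"{k} mishaps vs {k - 1}: {mishap_counts[k]:>6,} pilots, ratio {ratio:.3f}")
```

The last rows rest on one or two pilots, so their ratios carry almost no information.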


Within SPlus I have looked at logistic and probit regression but the large number of counts of non-events (0) compared to the number of accident events (1) seems to require additional consideration.
My question is: what regression or other modeling techniques would be best suited to studying these binary events, given the very large preponderance of outcomes at one end of the binary scale?
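To put a number on the imbalance, here is a small Python sketch of the baseline a logistic model would start from. It assumes the 15,652 of 1.4 million figure is used as the overall event proportion. The intercept of an intercept-only logistic regression equals the log-odds of that proportion. This only illustrates the scale of the problem and does not suggest a modeling technique.

```python
import math

# Assumption: overall event proportion taken from the counts quoted above.
events = 15_652
total = 1_400_000          # "1.4 million" taken literally
non_events = total - events

p = events / total

# Log-odds of the event; this is the fitted intercept of a logistic
# regression that contains no covariates.
log_odds = math.log(p / (1 - p))

# How many non-event pilots (0) there are for each event pilot (1).
non_events_per_event = non_events / events

print(f"Event proportion:      {p:.4f}")
print(f"Baseline log-odds:     {log_odds:.3f}")
print(f"Non-events per event:  {non_events_per_event:.1f}")
```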

Thanks for so many excellent postings.

Stephen Véronneau MD
Bioinformatics Research Team Lead

FAA Civil Aerospace Medical Institute