Dear Splus users,
I have a problem somewhat related to Roy Robertson's regarding
over-dispersed Poisson models.
I am trying to develop, in a first step, a Poisson model for a data set
with a very large number of cells (140,000) and roughly 20 categorical
variables with several levels each. From the start I suspected
over-dispersion and tried to estimate it by fitting the full model to the
data using the function glm with family=quasi(link='log',var='mu'). But glm
runs for more than an hour, only to tell me that convergence was not
attained in 10 iterations. It still returns parameter estimates, along with
an estimate of the dispersion of roughly 7.8. If I run glm for the NULL
model, with the other arguments unchanged, I get an estimate of about 2.4
for the dispersion. I would like to use the dispersion to help me select
the most significant drivers (probably using the function add1) to get the
modeling started; of course, the significance level of each variable
depends on the value of the dispersion parameter.
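
To make the role of the dispersion concrete, here is a minimal sketch on
small simulated data (not the real data set). The data frame d, the factors
a and b, and the true dispersion of 3 are illustrative assumptions. It
assumes the dispersion is estimated in the usual way for quasi families, as
the Pearson chi-square divided by the residual degrees of freedom, and it
uses anova with test="F" as a stand-in for the add1 step.

```r
# Simulated stand-in data; all names and sizes are illustrative assumptions.
set.seed(1)
n <- 2000
d <- data.frame(a = factor(sample(c("a1", "a2", "a3"), n, replace = T)),
                b = factor(sample(c("b1", "b2"), n, replace = T)))
mu <- exp(0.5 + 0.4 * (d$a == "a2") + 0.8 * (d$a == "a3"))

# 3 * Poisson(mu / 3) has mean mu and variance 3 * mu,
# i.e. an over-dispersed Poisson with dispersion 3.
d$y <- 3 * rpois(n, mu / 3)

fam <- quasi(link = "log", variance = "mu")
fit0 <- glm(y ~ 1, family = fam, data = d)   # NULL model
fit1 <- glm(y ~ a, family = fam, data = d)   # model with one driver

# Pearson estimate of the dispersion: chi-square / residual df.
mu.hat <- fitted(fit1)
phi <- sum((d$y - mu.hat)^2 / mu.hat) / fit1$df.residual
phi

# F test for adding 'a': the change in deviance is scaled by the
# dispersion, so the significance depends directly on its estimate.
anova(fit0, fit1, test = "F")
```

With this setup phi should come out close to the simulated value of 3; an
estimate taken from a fit that has not converged would distort every such
F test.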
I know that I could increase the default number of iterations in glm, but I
am reluctant to do so, since it already takes so much time to run as it is.
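
For reference, a minimal sketch of how the iteration limit can be raised,
again on small simulated data. The data frame d and the values maxit=50 and
trace=T are illustrative assumptions; the control argument and glm.control
are the standard way to pass these settings, and the iter component of the
fit is assumed to hold the number of iterations used.

```r
# Simulated stand-in data; names and sizes are illustrative assumptions.
set.seed(2)
n <- 1000
d <- data.frame(g = factor(sample(c("g1", "g2"), n, replace = T)))
mu <- exp(1 + 0.5 * (d$g == "g2"))
d$y <- 2 * rpois(n, mu / 2)   # mean mu, variance 2 * mu

# Raise the iteration limit and print the deviance at each iteration,
# so that slow or stalled convergence is visible while the fit runs.
fit <- glm(y ~ g, family = quasi(link = "log", variance = "mu"), data = d,
           control = glm.control(maxit = 50, trace = T))

fit$iter   # number of iterations actually used
```

The trace output shows whether the deviance is still changing when the
limit is reached, which helps decide whether more iterations are worth the
running time.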
I am running Splus 4.5 under Windows NT on a pretty powerful machine
(Pentium II, 350 MHz, 128 MB of RAM).
Any suggestions on how to tackle the problem?
Thanks in advance,
Gérald Jean
Consulting Analyst (Statistics), Actuarial Department
Telephone: (418) 835-8839
Fax: (418) 835-5865
E-mail: gerald.jean@spgdag.ca
-----------------------------------------------------------------------
This message was distributed by s-news@wubios.wustl.edu. To unsubscribe
send e-mail to s-news-request@wubios.wustl.edu with the BODY of the
message: unsubscribe s-news