Hello S-newsers. I am trying to fit some data to a negative binomial
distribution: I have ~70 sets of count data and want to fit each set
separately, mainly to determine the extent to which the zeros are in
excess of expectations, but also to assure myself that the NB is the
correct distribution to describe these data, which should be
distributed as an overdispersed Poisson (with the possible exception
of extra zeros).
(Note re NB: There are two ways of formulating it, one appropriate to
Bernoulli trials and the other to continuous distributions. The
parameter names used in the various formulations are different, and I
have not found a place where these are explained very well. For
example, rnbinom (S-PLUS) has r and p, and rnegbin (MASS) has mu and
theta. Unlike r, theta can take non-integer values.)
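
For reference, here is how I understand the two formulations to line up. This assumes r is the target number of successes and p the per-trial success probability, with the count y being the number of failures; if rnbinom defines them differently, the correspondence changes accordingly. The last line is the expected proportion of zeros, which is the quantity I want to compare against the observed zeros.

```latex
% (r, p) form: y failures before the r-th success
P(Y = y) = \frac{\Gamma(y + r)}{\Gamma(r)\, y!}\, p^{r} (1 - p)^{y},
\qquad y = 0, 1, 2, \dots

% (mu, theta) form
P(Y = y) = \frac{\Gamma(y + \theta)}{\Gamma(\theta)\, y!}
\left(\frac{\theta}{\theta + \mu}\right)^{\theta}
\left(\frac{\mu}{\theta + \mu}\right)^{y}

% correspondence between the two, and the implied moments
r = \theta, \qquad p = \frac{\theta}{\theta + \mu}, \qquad
E[Y] = \mu, \qquad \operatorname{Var}[Y] = \mu + \frac{\mu^{2}}{\theta}

% expected proportion of zeros
P(Y = 0) = \left(\frac{\theta}{\theta + \mu}\right)^{\theta}
```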
I estimated mu as the sample mean and theta with theta.ml (MASS).
However, ks.gof with the negbinom distribution has ONLY the discrete
(r, p) formulation, so that can't be used here. It is easy
enough to calculate the expected distribution from these parameters
and then run the Kolmogorov-Smirnov test, but I have read that this
test should be used against an expected distribution with KNOWN
parameters, not with parameters calculated from the data.
So given that I have to get the parameters from the data, is my only
choice for testing goodness of fit to simulate? If so, I assume I
could simply calculate the KS statistic repeatedly using samples from
the NB distribution with the estimated parameters and see whether my
sample KS statistic falls within the lower 95% of the simulated values. Is there a
better way to do this?
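
To make the simulation idea concrete, here is a rough sketch in Python (not S-PLUS), using only the standard library. All function names, the golden-section search standing in for theta.ml, its search bounds, and the example counts are my own placeholders. One assumption worth flagging: each simulated sample is re-fitted before its KS statistic is computed, so that the simulated statistics go through the same estimation step as the observed one.

```python
import math
import random
from bisect import bisect_left
from collections import Counter


def nb_logpmf(y, mu, theta):
    """Log pmf in the (mu, theta) form, where var = mu + mu**2 / theta."""
    y_term = y * math.log(mu / (theta + mu)) if y > 0 else 0.0
    return (math.lgamma(y + theta) - math.lgamma(theta) - math.lgamma(y + 1)
            + theta * math.log(theta / (theta + mu)) + y_term)


def fit_nb(counts):
    """Sample mean for mu; crude ML for theta (a stand-in for theta.ml)."""
    mu = sum(counts) / len(counts)
    freq = Counter(counts)

    def negll(log_theta):
        th = math.exp(log_theta)
        return -sum(k * nb_logpmf(y, mu, th) for y, k in freq.items())

    # Golden-section search over log(theta); the bounds are arbitrary.
    lo, hi = math.log(1e-3), math.log(1e3)
    g = (math.sqrt(5) - 1) / 2
    for _ in range(40):
        a, b = hi - g * (hi - lo), lo + g * (hi - lo)
        if negll(a) < negll(b):
            hi = b
        else:
            lo = a
    return mu, math.exp((lo + hi) / 2)


def ks_stat(counts, mu, theta):
    """Largest gap between the empirical and fitted CDFs at the integers."""
    freq, n = Counter(counts), len(counts)
    d = seen = fitted = 0.0
    for y in range(max(counts) + 1):
        seen += freq[y]
        fitted += math.exp(nb_logpmf(y, mu, theta))
        d = max(d, abs(seen / n - fitted))
    return d


def nb_cdf_table(mu, theta, tail=1e-10, max_len=100000):
    """F(0), F(1), ... until the remaining upper tail is below `tail`."""
    cdf, total, y = [], 0.0, 0
    while total < 1.0 - tail and y < max_len:
        total += math.exp(nb_logpmf(y, mu, theta))
        cdf.append(min(total, 1.0))
        y += 1
    return cdf


def rnegbin(n, cdf, rng):
    """Draw n NB variates by inverting a precomputed CDF table."""
    top = len(cdf) - 1
    return [min(bisect_left(cdf, rng.random()), top) for _ in range(n)]


def bootstrap_gof(counts, n_sim=500, seed=1):
    """Observed KS statistic and the share of simulated ones at least as large."""
    rng = random.Random(seed)
    mu, theta = fit_nb(counts)
    d_obs = ks_stat(counts, mu, theta)
    cdf = nb_cdf_table(mu, theta)
    exceed = 0
    for _ in range(n_sim):
        sim = rnegbin(len(counts), cdf, rng)
        # Re-fit each simulated sample before computing its statistic.
        exceed += ks_stat(sim, *fit_nb(sim)) >= d_obs
    return d_obs, (exceed + 1) / (n_sim + 1)


# Made-up counts, for illustration only.
counts = [0, 0, 0, 1, 0, 2, 5, 0, 1, 3, 0, 0, 7, 1, 0, 2, 0, 4, 1, 0]
d_obs, p_value = bootstrap_gof(counts)
print(f"KS statistic = {d_obs:.3f}, simulated p-value = {p_value:.3f}")
```

With ~70 data sets this would simply be looped over each set. The same loop could also record the simulated proportion of zeros, to compare against the observed proportion in each set.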
Thanks....
======================
Dr. Wim Kimmerer
Research Professor of Biology
Romberg Tiburon Center
San Francisco State University
3152 Paradise Drive
Tiburon CA 94920
Ph. (415) 435-7143
Fax (415) 435-7120
http://online.sfsu.edu/~kimmerer/