All,
I am using S-PLUS 6 R2 on Windows XP.
I have a probability density function (mypdf) from a large dataset (> 10,000
observations), calculated using the density() function. It looks like this:
> mypdf
$x:
 [1]  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19
[21] 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
$y:
 [1] 0.000746826 0.007841673 0.006721434 0.012696042 0.014936520
 [6] 0.028752800 0.032113519 0.051530994 0.061613142 0.075802840
[11] 0.081404030 0.093726665 0.079910383 0.070575058 0.057505600
[16] 0.049290515 0.029126214 0.029126214 0.022031367 0.022031367
[21] 0.026138909 0.020911127 0.019790888 0.015683345 0.011575803
[26] 0.011575803 0.012322629 0.009708738 0.009708738 0.008215086
[31] 0.008215086 0.007094847 0.003360717 0.001493652 0.002987304
[36] 0.001867065 0.001120239 0.000373413 0.000000000 0.000373413
The values in mypdf$x are speeds. I looked for a function to calculate the
CDF and ended up using cumsum() as a proxy. I found cdf.compare(), but I don't
want to compare anything at this point, just look at the CDF. I also want to
calculate the probability of exceeding a certain value in $x, say 20. To do
this, I had to brute-force it by matching the cumsum() results to the value at
the same index in $x. The S-PLUS probability functions that I found in the
Guide to Statistics all require a theoretical distribution to be specified. Is
there a function that will calculate the CDF of a dataset, and the probability
of occurrence of a value in that dataset, without a priori knowledge of a
theoretical distribution and without my having to write the code manually?
Something that I can give a value of $x (like 20) and have it return the
corresponding CDF value?
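For concreteness, the kind of lookup I am after could be sketched roughly as
below. This is only an illustration, not a built-in: cdf.from.pdf() and
p.exceed() are hypothetical names, and the sketch assumes that $x is sorted in
ascending order and that the cumsum() of $y, rescaled so its final value is 1,
is an acceptable approximation to the CDF. It uses only cumsum() and approx(),
and linearly interpolates between the tabulated speeds.

```r
# Hypothetical helper (not a built-in): approximate CDF from a
# density()-style list with components $x and $y.
# Assumes pdf$x is sorted in ascending order.
cdf.from.pdf <- function(pdf, q) {
    cum <- cumsum(pdf$y)
    cum <- cum / cum[length(cum)]    # rescale so the last value is exactly 1
    approx(pdf$x, cum, xout = q)$y   # linear interpolation at q
}

# Hypothetical helper: probability of exceeding q.
# Values of q outside the range of pdf$x give NA.
p.exceed <- function(pdf, q) {
    1 - cdf.from.pdf(pdf, q)
}
```

With the object shown above, p.exceed(mypdf, 20) would then give the estimated
probability of exceeding a speed of 20.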
I would think that with this many data points, I would not need to fit the
data to a distribution, since the sample is large enough to reveal the 'true'
distribution of the data. Not being a statistician, I could be wrong on that
point. Nonetheless, a function like the one I describe above would still be
useful. S-PLUS being the wonderful software that it is, I assume such a
function exists; I just can't find it. Thanks.
***********************************************************************
Winifred C. Lambert Senior Scientist/Meteorologist
ENSCO, Inc.
Aerospace Sciences and Engineering Division
1980 N. Atlantic Ave, Suite 230
Cocoa Beach, FL 32931
VOICE: 321.853.8130 FAX: 321.853.8415
lambert.winifred@ensco.com
AMU Quarterly Reports are available online:
http://science.ksc.nasa.gov/amu/home.html
***********************************************************************