Hi,
I'm using the EnvironmentalStats module to find the best-fitting
distribution for a single column of data with the Kolmogorov-Smirnov
(KS) method, taking the distribution with the largest p-value as the
best. My question is: what commands should I write to pick the best
distribution automatically from the p-values? In the example below I
look for it manually (it turns out to be the beta distribution). Once
the distribution has been identified, I generate random numbers from
it using the fitted parameters.
# FIT WEIBULL DISTRIBUTION USING K.S METHOD AND OBTAIN THE P-VALUE
fit.weibull<-ks.gof(indata$Loss.AM.Lakhs,distribution="weibull")
summary(fit.weibull)
p.weibull<-fit.weibull$p.value
statistic.weibull<-fit.weibull$statistic
fit.weibull$distribution
plot.gof(fit.weibull,plot.type="Observed Distribution Overlaid with Fitted Distribution")
plot.gof(fit.weibull,plot.type="Observed CDF Overlaid with Fitted CDF")
plot.gof(fit.weibull, plot.type="Result")
# FIT LOGNORMAL DISTRIBUTION USING K.S METHOD AND OBTAIN THE P-VALUE
fit.lognormal<-ks.gof(indata$Loss.AM.Lakhs,distribution="lnorm")
summary(fit.lognormal)
p.lognormal<-fit.lognormal$p.value
statistic.lognormal<-fit.lognormal$statistic
plot.gof(fit.lognormal,plot.type="Observed Distribution Overlaid with Fitted Distribution")
plot.gof(fit.lognormal,plot.type="Observed CDF Overlaid with Fitted CDF")
plot.gof(fit.lognormal,plot.type="Result")
# FIT PARETO DISTRIBUTION USING K.S METHOD AND OBTAIN THE P-VALUE
fit.pareto<-ks.gof(indata$Loss.AM.Lakhs,distribution="pareto")
summary(fit.pareto)
p.pareto<-fit.pareto$p.value
statistic.pareto<-fit.pareto$statistic
plot.gof(fit.pareto,plot.type="Observed Distribution Overlaid with Fitted Distribution")
plot.gof(fit.pareto,plot.type="Observed CDF Overlaid with Fitted CDF")
plot.gof(fit.pareto,plot.type="Result")
# FIT BETA DISTRIBUTION USING K.S METHOD AND OBTAIN THE P-VALUE
fit.beta<-ks.gof(indata$Loss.AM.Lakhs/1000,distribution="beta")
summary(fit.beta)
p.beta<-fit.beta$p.value
statistic.beta<-fit.beta$statistic
#plot.gof(fit.beta,plot.type="Observed Distribution Overlaid with Fitted Distribution")
#plot.gof(fit.beta,plot.type="Observed CDF Overlaid with Fitted CDF")
plot.gof(fit.beta,plot.type="Result")
# FIND THE BEST DISTRIBUTION BASED ON P-VALUE
mydata <-c(p.weibull,p.lognormal,p.pareto,p.beta)
sort(mydata)[length(mydata):1]
max(mydata)
# OBTAIN THE PARAMETERS OF BEST (BETA DISTRIBUTION)
a <-fit.beta$distribution.parameters
# GENERATE b No. OF RANDOM NUMBERS FROM BETA DIST
dat <- rbeta(b[i], shape1=a[1], shape2=a[2])
dat1<-dat*1000
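
What I am after is roughly the following. This is only a sketch in R
syntax: the fits list holds made-up stand-ins for the ks.gof() results
(only the p.value and distribution.parameters components used above are
mimicked), and the names fits, generators, best.name and n are my own
placeholders. Pareto is left out because I am not sure which
random-number function to map it to.

```r
# Illustrative stand-ins for the objects returned by ks.gof().
# The p-values and parameters are made up; the real fit objects
# (fit.weibull, fit.lognormal, fit.beta) would go here instead.
fits <- list(
  weibull   = list(p.value = 0.21,
                   distribution.parameters = c(shape = 1.4, scale = 0.3)),
  lognormal = list(p.value = 0.08,
                   distribution.parameters = c(meanlog = -1.5, sdlog = 0.9)),
  beta      = list(p.value = 0.47,
                   distribution.parameters = c(shape1 = 2.0, shape2 = 5.0))
)

# Collect the p-values into a vector named by distribution.
p.values <- sapply(fits, function(f) f$p.value)

# Name and parameters of the distribution with the largest p-value.
best.name <- names(p.values)[which.max(p.values)]
best.par  <- fits[[best.name]]$distribution.parameters

# Map each distribution name to its random-number generator.
generators <- list(weibull = rweibull, lognormal = rlnorm, beta = rbeta)

# Draw n random numbers from the selected distribution, passing the
# fitted parameters positionally after the sample size.
n <- 10  # hypothetical sample size
dat <- do.call(generators[[best.name]],
               c(list(n), as.list(unname(best.par))))
```

With the real fit objects in the list, best.name would replace the
manual comparison, and the beta draws would still need multiplying by
1000 to undo the scaling applied before fitting.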
Thank you in advance.
Best Regards,
Bikash Jain