
Re: [S] reasonable p-values for Fisher exact's test - WAS strange ...

To: cberry@tajo.ucsd.edu (Charles C. Berry)
Subject: Re: [S] reasonable p-values for Fisher exact's test - WAS strange ...
From: "Patrick Connolly" <PConnolly@grunt.marc.cri.nz>
Date: Wed, 25 Mar 1998 13:19:18 +1200 (NZST)
Cc: s-news@wubios.wustl.edu (Snews)
In-reply-to: <351851FA.563F4B96@tajo.ucsd.edu> from "Charles C. Berry" at Mar 24, 98 04:38:18 pm
Reply-to: "Patrick Connolly" <PConnolly@grunt.marc.cri.nz>
Sender: owner-s-news@wubios.wustl.edu
According to Charles C. Berry:
|> 
|> Before this thread enters an infinite loop, a few observations:
|> 
|> First, class(fisher.test(etc) ) == "htest"
|> 
|> So, print.htest() will format the results of fisher.test(). This is done
|> as follows
|> 
|>   cat("p-value =", format(round(x$p.value, 4)), "\n")
|> 
|> (on Version 3.4 Release 1 for Sun SPARC, SunOS 4.1.3_U1 : 1996)
|> 
|> So the reports that fisher.test() *seemed* to work OK only imply that
|> the first 5 digits were OK.
|> 
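
A small illustration of why the printed output hides the discrepancy: the sketch below applies the same round-then-format step to two values that differ around the eighth decimal place. It is written in R syntax and is only assumed to behave the same way in S-PLUS. The names p.approx and p.exact are made up for this example; 0.4666666 is the value fisher.test() printed in this thread, and 7/15 is the exact sum 6/15 + 1/15 of the two hypergeometric terms.

```r
# Value reported by fisher.test() in the thread, and the exact value 7/15.
p.approx <- 0.4666666
p.exact  <- 7/15

# The formatting step quoted above: round to 4 decimals, then format.
shown.approx <- format(round(p.approx, 4))
shown.exact  <- format(round(p.exact, 4))

# Both strings are the same, so the printed htest output cannot
# reveal a difference this small.
print(identical(shown.approx, shown.exact))

# Pulling out the component and asking for more digits does reveal it.
print(p.approx, digits = 10)
print(p.exact, digits = 10)
```

So a printed p-value that looks right says nothing about digits beyond the fourth decimal place; the $p.value component has to be inspected directly.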
|> Also, note that fisher.test() uses an algorithm which allows R x C
|> tables. This isn't required in simple 2 x 2 tables (and it wouldn't be
|> too hard to put in a switch for such tables), but this is what gets
|> used. 
|> 
|> Getting to the point:
|> 
|> This algorithm usually yields answers that differ numerically from the
|> exact hypergeometric probability, viz. the result of:
|> 
|> > fisher.test(matrix(c(0,2,2,2),nc=2))$p
|> [1] 0.4666666
|> 
|>  differs from 
|> 
|> > dhyper(0:2,2,4,2)
|> [1] 0.40000000 0.53333333 0.06666667
|> 
|> by an amount
|> 
|> > fisher.test(matrix(c(0,2,2,2),nc=2))$p-sum(dhyper(c(0,2),2,4,2))
|> [1] -2.78155e-08
|> > 
|> 
|> And this isn't an isolated case. The following summaries are of numbers
|> that all equal zero under exact (and obvious) arithmetic:
|> 
|> > summary(sapply(1:20,function(x) fisher.test(matrix(c(1,1,x,x),nc=2))$p-1.0))
|>        Min.    1st Qu.     Median       Mean   3rd Qu.      Max. 
|>  -1.407e-05 -1.997e-06 -8.941e-08 -3.189e-07 1.192e-06 1.562e-05
|> > summary(sapply(1:20,function(x) fisher.test(matrix(c(2,2,x,x),nc=2))$p-1.0))
|>        Min.    1st Qu.    Median       Mean   3rd Qu.      Max. 
|>  -2.325e-05 -2.295e-06 2.384e-07 -7.927e-07 2.712e-06 7.868e-06
|> 
|> Only 1 of 40, c(1,1,5,5), gives exactly 0.0 as the result.
|> 
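
For comparison, here is a rough sketch of the same check for the c(1,1,x,x) tables done with dhyper() instead of fisher.test(). It assumes R syntax and the usual two-sided rule (sum the probabilities of all tables no more likely than the observed one). The name dev and the relative tolerance of 1e-7 are choices made for this example, not anything taken from fisher.test() itself.

```r
# Table matrix(c(1,1,x,x), nc=2): first column sums to 2, second column
# sums to 2*x, first row sums to 1 + x, and the observed [1,1] cell is 1.
dev <- sapply(1:20, function(x) {
  probs <- dhyper(0:2, 2, 2 * x, 1 + x)   # all possible [1,1] cells
  p.obs <- dhyper(1, 2, 2 * x, 1 + x)     # probability of observed table
  # Two-sided p-value; the small tolerance guards against rounding in ties.
  p <- sum(probs[probs <= p.obs * (1 + 1e-7)])
  p - 1.0
})

print(summary(dev))
print(max(abs(dev)))
```

The observed cell is the most probable of the three, so every term enters the sum and the p-value is 1 apart from floating-point rounding in dhyper().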
|> So, fisher.test() apparently uses an approximation which gives a correct
|> answer for the first 5 or 6 significant digits most of the time.
|> 
|> Even though the table
|> 
|>      matrix(1,nr=2,nc=2)
|> 
|> would obviously lead to a p-value of exactly 1.0, it seems of little
|> practical import that fisher.test() reports it as 
|> 
|> > print(fisher.test(matrix(1,nr=2,nc=2))$p,digits=10)
|> [1] 0.9999998808
|> > 
|> 
|> If this is a problem, then dhyper() can be used in 2 x 2 tables. It
|> seems to generate results that are close to machine accuracy.
|> 
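
One way the dhyper() suggestion might be packaged for 2 x 2 tables is sketched below. The function name exact.p.2x2 is hypothetical, the code assumes R syntax, and the 1e-7 relative tolerance for ties is an arbitrary choice for this sketch. It uses the same argument order as the dhyper(0:2,2,4,2) call quoted above: first-column total, second-column total, first-row total.

```r
# Two-sided exact p-value for a 2 x 2 table via the hypergeometric
# distribution (hypothetical helper, not part of any library).
exact.p.2x2 <- function(tab) {
  stopifnot(all(dim(tab) == c(2, 2)))
  x <- tab[1, 1]          # observed count in cell [1,1]
  m <- sum(tab[, 1])      # first column total
  n <- sum(tab[, 2])      # second column total
  k <- sum(tab[1, ])      # first row total
  # Every value cell [1,1] can take with the margins held fixed.
  support <- max(0, k - n):min(k, m)
  probs <- dhyper(support, m, n, k)
  p.obs <- dhyper(x, m, n, k)
  # Sum the tables that are no more probable than the observed one.
  min(1, sum(probs[probs <= p.obs * (1 + 1e-7)]))
}

print(exact.p.2x2(matrix(c(0, 2, 2, 2), ncol = 2)), digits = 10)
print(exact.p.2x2(matrix(1, nrow = 2, ncol = 2)), digits = 10)
```

The first call reproduces sum(dhyper(c(0,2),2,4,2)) from the example above, and the second is the all-ones table whose p-value should be 1.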
|> -- 
|> 
|> Charles C. Berry                        (619) 534-2098 
|>                                          Dept of Family/Preventive Medicine
|> E mailto:cberry@tajo.ucsd.edu                 UC San Diego
|> http://hacuna.ucsd.edu/members/ccb.html  La Jolla, San Diego 92093-0622
|> -----------------------------------------------------------------------
|> This message was distributed by s-news@wubios.wustl.edu.  To unsubscribe
|> send e-mail to s-news-request@wubios.wustl.edu with the BODY of the
|> message:  unsubscribe s-news
|> 

