Re: Permutation test?!

To: Weigl Klemens <weiglk@stud.uni-graz.at>
Subject: Re: Permutation test?!
From: Tim Hesterberg <timh@insightful.com>
Date: Mon, 17 Sep 2007 09:52:41 -0700
Cc: s-news@lists.biostat.wustl.edu
In-reply-to: <1189858789.46ebcde56932d@webmail.stud.uni-graz.at> (message from Weigl Klemens on Sat, 15 Sep 2007 14:19:49 +0200)
References: <1189858789.46ebcde56932d@webmail.stud.uni-graz.at>
Normally, to do a permutation test for the correlation between x and y,
you would permute one of the variables, not pool the data.
E.g. if x and y are variables in a data frame, then:

> data <- data.frame(x = rnorm(100), y = rnorm(100))
> permutationTest(data, resampCor, resampleColumns = "x")
Call:
permutationTest(data = data, statistic = resampCor, resampleColumns = "x")

Number of Replications: 999 

Summary Statistics:
         Observed     Mean      SE alternative p-value 
cor(x,y)  -0.0564 0.004596 0.09524   two.sided   0.514


This permutes the "x" column, leaving "y" fixed.  It is appropriate
for testing the null hypothesis that x and y are independent, against
the alternative that they are related.
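
For readers without the resample library, here is a rough sketch of the
same idea in plain Python. It is not the library's implementation: the
function names (pearson, permutation_cor_test), the seed handling, and
the p-value convention (count + 1) / (replications + 1) based on
absolute correlations are all assumptions made for illustration.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def permutation_cor_test(x, y, n_perm=999, seed=0):
    """Two-sided permutation test for correlation (illustrative sketch).

    Only x is shuffled; y stays fixed, so each replicate breaks the
    pairing between x and y while keeping both marginal distributions.
    """
    rng = random.Random(seed)
    observed = pearson(x, y)
    x_perm = list(x)  # copy, so the caller's data is left untouched
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(x_perm)
        if abs(pearson(x_perm, y)) >= abs(observed):
            extreme += 1
    # Assumed convention: count the observed statistic as one replicate.
    p_value = (extreme + 1) / (n_perm + 1)
    return observed, p_value


# Example with independent normal data, similar in spirit to the call above.
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(100)]
y = [rng.gauss(0, 1) for _ in range(100)]
observed, p_value = permutation_cor_test(x, y)
print("observed correlation:", observed, "p-value:", p_value)
```

Note that x and y are never mixed: every replicate still has all the
original x values and all the original y values, only paired differently.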

In contrast, pooling the data would be appropriate for comparing two
samples, e.g. is a difference in means (or another statistic)
significantly different from zero, or is a ratio of means (or another
statistic) significantly different from one.
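
As a contrast, here is a rough sketch of the pooled two-sample case in
plain Python, using a difference in means. Again this is not the resample
library's code: the function names (mean, permutation_mean_diff_test) and
the p-value convention (count + 1) / (replications + 1) are assumptions
made for illustration.

```python
import random


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def permutation_mean_diff_test(a, b, n_perm=999, seed=0):
    """Two-sided pooled permutation test for a difference in means.

    The two samples are pooled, the pool is shuffled, and the first
    len(a) values form the new first sample; the rest form the second.
    """
    rng = random.Random(seed)
    observed = mean(a) - mean(b)
    pooled = list(a) + list(b)
    n_a = len(a)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Assumed convention: count the observed statistic as one replicate.
    p_value = (extreme + 1) / (n_perm + 1)
    return observed, p_value


# Example: two samples drawn from the same normal distribution.
rng = random.Random(2)
a = [rng.gauss(0, 1) for _ in range(50)]
b = [rng.gauss(0, 1) for _ in range(50)]
observed_diff, p_value_diff = permutation_mean_diff_test(a, b)
print("observed difference:", observed_diff, "p-value:", p_value_diff)
```

Here pooling makes sense because, under the null hypothesis, both samples
come from the same distribution, so the group labels are exchangeable.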

Tim Hesterberg

========================================================
| Tim Hesterberg       Senior Research Scientist       |
| timh@insightful.com  Insightful Corp.                |
| (206)802-2319        1700 Westlake Ave. N, Suite 500 |
| (206)283-8691 (fax)  Seattle, WA 98109-3044, U.S.A.  |
|                      www.insightful.com/Hesterberg   |
========================================================
I'll teach short courses:
Bootstrap Methods and Permutation Tests
                Oct 10-11 San Francisco
Advanced Programming in S-PLUS
                Oct 8-9 San Francisco
http://www.insightful.com/services/training.asp


>Dear S-News Members!
>
>
>I've got a particular problem and I'm not sure whether a permutation test is the
>right way to solve it, though intuitively I'd use one.
>
>
>
>
>(1) I've got two samples x = (x1, x2, ..., xn) and y = (y1, y2, ..., yn), both of
>the same size n, which I'd like to pool into one big sample.
>(2) Then I'd like to draw n elements at random, without replacement, from the
>pooled sample to form the first NEW sample.
>(3) All the elements of the pooled sample that weren't drawn automatically
>form the second NEW sample.
>(4) Then I'd like to evaluate the statistic of interest, a correlation
>coefficient.
>(5) If I repeat this procedure B = 1000 times, I should get an empirical
>estimate of the variability of the correlation coefficient, which is what I'm
>interested in.
>
>
>To do steps (1) to (5) I've used permutationTest from the resample
>library. Here's the syntax:
>> permutationTest(data = XY [c("x", "y")], statistic = 
>> resampCor(data, resampleColumns = "x"), alternative = 
>> "two.sided", resampleColumns = "x")
>
>
>My questions are:
>i) Is this the right syntax to do the steps (1) to (5)? 
No, but I doubt you want to do that.

>ii) Furthermore, if I use resampleColumns = "x", which the S-PLUS resample
>help recommends for correlations, how exactly should I understand this?
>Is only column x permuted while column y is left as it is?
Yes.

>iii) Most important question: Is there another, better way to do (1) to (5)?
See above.

>
>Please, can anybody help me?
>Thank you!
>
>Yours sincerely,
>Weigl Klemens
