s-news

Re: random permutation of ranks

To: "Manoel Pacheco" <mpacheco@glec-oh.com>
Subject: Re: random permutation of ranks
From: Tim Hesterberg <timh@insightful.com>
Date: 16 Jul 2004 16:49:37 -0700
Cc: <s-news@lists.biostat.wustl.edu>
In-reply-to: <000c01c46b85$738169e0$0efea8c0@MANOLO> (mpacheco@glec-oh.com)
References: <000c01c46b85$738169e0$0efea8c0@MANOLO>
>I ranked a list of chemicals by two distinct variables (one is a "Standard"
>and the other a surrogate), and made an initial attempt to compare the two
>sets of ranks by computing the sum of absolute differences between them.
>Next, I generated random permutations of the "standard" ranks and computed
>the sum of absolute differences between them. The random permutation of
>ranks was performed with the code below:
>
>        dif <- vector("numeric",10000)
>        rstand <- rank(Standard)
>        for (i in 1:10000) {
>            smp <- sample(rstand, replace=FALSE)
>            dif[i] <- sum(abs(smp-rstand))
>            }
>
>       randif <- sort(dif)
>
>The observed difference between ranks based on the standard and the
>surrogate variable was less than the lowest value computed from the random
>permutation of ranks, suggesting that the surrogate variable is adequate. Is
>there a more elegant way to compare the two sets of ranks?

I think you have a subtle error there; you should generate
random permutations of the Surrogate ranks, for comparing to the
standard ranks.  E.g.
  dif <- vector("numeric",10000)
  rstand <- rank(Standard)
  rSurrogate <- rank(Surrogate)
  for (i in 1:10000) {
      smp <- sample(rSurrogate, replace=FALSE)
      dif[i] <- sum(abs(smp-rstand))
      }

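This matters if there are ties: the Standard and Surrogate ranks then
need not contain the same set of values, so permuting one is not
interchangeable with permuting the other.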
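
A small illustration of the point about ties may help. The data below
are made up for the example (they are not from the original post), and
the variable names sameValues and minPossible are hypothetical. It is a
sketch that assumes the default average-rank treatment of ties in rank().

```r
# Hypothetical data: Standard contains a tie, Surrogate does not.
Standard  <- c(1.2, 3.4, 3.4, 5.0)
Surrogate <- c(0.7, 2.1, 2.9, 4.4)

rstand     <- rank(Standard)     # tied values share the average rank 2.5
rSurrogate <- rank(Surrogate)    # no ties: ranks 1, 2, 3, 4

# Do the two rank vectors contain the same values?
sameValues <- isTRUE(all.equal(sort(rstand), sort(rSurrogate)))

# Pairing the sorted vectors gives the smallest sum of absolute
# differences that any permutation of rSurrogate can reach against rstand.
minPossible <- sum(abs(sort(rSurrogate) - sort(rstand)))
```

Here sameValues is FALSE and minPossible is greater than zero, whereas
permuting rstand against itself can reach exactly zero. The two schemes
therefore give different permutation distributions.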

It is also conceptually cleaner -- you want to compare the
value of an observed statistic with the same statistic computed
under permutations.  Your observed statistic is
        sum(abs(rank(Standard) - rank(Surrogate)))
so your comparison should be:
        sum(abs(rank(Standard) - rank(sample(Surrogate))))
not
        sum(abs(rank(Standard) - rank(sample(Standard))))
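
Putting the pieces together, the comparison can be turned into a
one-sided p-value. The following is only a sketch under stated
assumptions: the data are invented for illustration, the names observed,
nperm and pvalue are hypothetical, and the lower tail is used because
small sums of absolute differences indicate agreement between rankings.

```r
set.seed(1)    # only so that the sketch is reproducible

# Hypothetical data, for illustration only
Standard  <- c(2.1, 3.5, 0.4, 7.8, 5.0, 6.2, 1.3, 9.9)
Surrogate <- c(1.9, 4.1, 0.8, 6.5, 5.7, 7.0, 1.1, 8.8)

rstand     <- rank(Standard)
rSurrogate <- rank(Surrogate)

# Observed statistic: small values mean the two rankings agree
observed <- sum(abs(rstand - rSurrogate))

# The same statistic under random permutations of the Surrogate ranks
nperm <- 10000
dif <- numeric(nperm)
for (i in 1:nperm) {
    smp <- sample(rSurrogate, replace=FALSE)
    dif[i] <- sum(abs(smp - rstand))
}

# Lower-tail p-value; the +1 counts the observed arrangement itself,
# so the estimate can never be exactly zero
pvalue <- (sum(dif <= observed) + 1) / (nperm + 1)
```

Adding one to the numerator and denominator is a common convention for
permutation p-values, not something the original code does.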

--------------------------------------------------
Back to your original question,
you could use the inner product of the two sets of ranks
(this is equivalent in a permutation test to the correlation
between ranks),
or inner product between normal scores computed from each rank,
e.g. qnorm(rstand/(length(rstand)+1))
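
As a rough sketch of these two alternatives, and of why the inner
product is equivalent to the rank correlation under permutation: the
sums and sums of squares of each rank vector do not change when one
vector is permuted, so the correlation is a linear function of the inner
product. The data and the names innerProduct, normalScoreStat and linear
are hypothetical, chosen only for this example.

```r
set.seed(2)    # only so that the sketch is reproducible

# Hypothetical data, for illustration only
Standard  <- c(2.1, 3.5, 0.4, 7.8, 5.0, 6.2, 1.3, 9.9)
Surrogate <- c(1.9, 4.1, 0.8, 6.5, 5.7, 7.0, 1.1, 8.8)

rstand     <- rank(Standard)
rSurrogate <- rank(Surrogate)
n <- length(rstand)

# Statistic 1: inner product of the two sets of ranks
innerProduct <- sum(rstand * rSurrogate)

# Statistic 2: inner product of normal scores computed from each rank
zstand     <- qnorm(rstand / (n + 1))
zSurrogate <- qnorm(rSurrogate / (n + 1))
normalScoreStat <- sum(zstand * zSurrogate)

# Check the equivalence: over permutations, the inner product and the
# rank correlation should be perfectly linearly related.
nperm <- 200
ip <- numeric(nperm)
r  <- numeric(nperm)
for (i in 1:nperm) {
    smp   <- sample(rSurrogate)
    ip[i] <- sum(rstand * smp)
    r[i]  <- cor(rstand, smp)
}
linear <- cor(ip, r)    # equal to 1 up to rounding error
```

Note that for both of these statistics large values indicate agreement,
so the relevant tail of the permutation distribution is the upper one,
the opposite of the sum of absolute differences.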

--------------------------------------------------
Finally, whatever statistic you choose, you may use permutationTest
to do the calculations, e.g.

permResults <- permutationTest( rSurrogate, sum(abs(rSurrogate - rstand)))
permResults
plot(permResults)


permutationTest() is part of the resample library, see below.

Tim Hesterberg

========================================================
| Tim Hesterberg       Research Scientist              |
| timh@insightful.com  Insightful Corp.                |
| (206)802-2319        1700 Westlake Ave. N, Suite 500 |
| (206)802-2500 (fax)  Seattle, WA 98109-3044, U.S.A.  |
|                      www.insightful.com/Hesterberg   |
========================================================
Download the S+Resample library from www.insightful.com/downloads/libraries

