Hi,
That's a nice algorithm, Bill, conveniently packaged. Its nice
that you can generate one combination at a time if you want, and
easily generate contiguous subsets of all combinations. As this
casual onlooker has mentioned once or twice in this space before,
he has an implementation of a somewhat different algorithm for
generating combinations in function combn(). This can be found at
http://www.spg.wau.nl/pv/pub/chasalow/S/win/combinat . Just for
fun I took the meat out of combn() and inserted it into the
structure of Bill's next.subset(). With that wrapper, the two
algorithms were essentially indistinguishable in efficiency for
the minimal testing I did. (They also, rather conveniently, gave
the same answers.) Generating (but not doing anything interesting with)
the 15504 combinations of 1:20 taken 5 at a time took each function
around 45 seconds on a 300 MHz Pentium running S-PLUS 4.5 under Win NT.
For much larger problems, I strongly second Bill's comment that this
"is the sort of territory where dynamically loaded code starts to become
unavoidable."
Function combn(), by the way, includes a "FUN" argument for in-line
processing of the combinations.
Cheers,
Scott
Cereon Genomics
Scott.Chasalow@cereon.com
} -----Original Message-----
} From: Bill Venables [mailto:venables@acland.qld.cmis.csiro.au]
} Sent: Friday, 30 July, 1999 02:56
} To: moord@email.uc.edu
} Cc: S-news@wubios.wustl.edu
} Subject: Re: [S] Help eliminate a recursive function
}
} David,
}
} Doing things in a loop can be rather costly in S, too, but if you
} need to generate all possible subsets sequentially here is *one*
} way to do it.
}
} (Code below; figuring out how it works, and improving it, remains
} an exercise for the casual onlooker.)
}
} It returns the next subset at each call, and NULL when the
} sequence is finished. Here is an example of how you might use it:
}
} > while(!is.null(m <- next.subset(5, 3)))
} {
} # `m' holds the index of the subset
} print(m)
} }
} [1] 1 2 3
} [1] 1 2 4
} [1] 1 2 5
} [1] 1 3 4
} [1] 1 3 5
} [1] 1 4 5
} [1] 2 3 4
} [1] 2 3 5
} [1] 2 4 5
} [1] 3 4 5
} >
}
} (Instead of "print(m)" you may wish to do something more
} interesting...)
}
} ---(cut here)---
} next.subset <- function(n, r) {
} if(exists(NR <- paste(n, r), frame = 0)) {
} nr <- get(NR, frame = 0)
} if(any(w <- (nr < (n-r+1):n))) {
} m <- max((1:r)[w])
} k <- m:r
} nr[k] <- k + nr[m] - m + 1
} }
} else {
} remove(NR, frame = 0)
} return(NULL)
} }
} }
} else
} nr <- 1:r
} assign(NR, nr, frame = 0)
} nr
} }
}
} cancel <- function(n, r)
} invisible(if(exists(nr <- paste(n, r), frame = 0))
} remove(nr, frame = 0))
} ---(cut here)---
}
} If you wish to re-start the sequence you should use the function
}
} > cancel(5, 3)
}
} to go back to the beginning.
}
} The function is just a way of presenting an algorithm. If you
} really are committed to using a very long loop in S code, you
} probably should build in the next.subset computation as part of
} the loop rather than calling a function every time, but there may
} not be very much in it, especially if each step is going to be a
} pretty large computation, anyway. This is the sort of territory
} where dynamically loaded code starts to become unavoidable.
}
} --
} -----------------------------------------------------------------
} Bill Venables, Statistician, CMIS Environmetrics Project.
}
} Physical address: Postal address:
} CSIRO Marine Laboratories, PO Box 120,
} 233 Middle St, Cleveland, Queensland Cleveland, Qld, 4163
} AUSTRALIA AUSTRALIA
}
} Telephone: +61 7 3826 7251 Email: Bill.Venables@cmis.csiro.au
} Fax: +61 7 3826 7304
} --------------------------------------------------------------
-----------------------------------------------------------------------
This message was distributed by s-news@wubios.wustl.edu. To unsubscribe
send e-mail to s-news-request@wubios.wustl.edu with the BODY of the
message: unsubscribe s-news
|