>Does anyone know of a reference that explains the algorithm used for the
>'minimal' option to the 'sample' command in S-Plus 7 and higher? The help
>page for 'sample' does not give a reference; nor does the source code for
>'sample.default'.

First a quick plug: sampling with minimal replacement would be useful
in many situations where people currently use sampling with replacement,
for example in SIR (sampling importance-resampling), to convert
a set of observations with unequal weights to a set with equal weights.

In the case of sampling with equal probabilities:
If 'minimal=TRUE', then sampling is with minimal replacement: if
'size>n' then each observation is included 'size %/% n' times, and
the remaining 'size %% n' draws are taken without replacement.
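
The equal-probability rule can be sketched in a few lines. This is a minimal Python illustration, not the S-PLUS source; the function name 'sample_minimal_equal', its arguments, and the final shuffle of the result are assumptions made for the example.

```python
import random

def sample_minimal_equal(n, size, rng=random):
    """Draw `size` indices from range(n) with minimal replacement.

    Hypothetical helper for illustration; not the S-PLUS implementation.
    """
    # Analogues of 'size %/% n' and 'size %% n'.
    full, extra = divmod(size, n)
    # Every observation is included `full` times ...
    out = list(range(n)) * full
    # ... and the remaining draws are taken without replacement.
    out += rng.sample(range(n), extra)
    # Assumption: return the draws in random order.
    rng.shuffle(out)
    return out
```

For example, with n = 5 and size = 12, every index appears at least twice, and two randomly chosen indices appear a third time.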

In the case of unequal probabilities:
If 'minimal=TRUE', then the probability of selecting observation 'j'
equals 'size*prob[j]' (when this is at most 1), and every permutation
of a set of outcomes has the same probability of being chosen
(assuming 'order=TRUE'). The algorithm randomly permutes the values
and corresponding probabilities, divides the unit interval into blocks
of length proportional to 'prob', draws a systematic sample of 'size'
random numbers uniformly on the unit interval, selects the observations
for which the uniform numbers fall into the corresponding blocks, then
does a final random permutation (if 'order=TRUE'). "Minimal replacement"
is not taken literally in the unequal probability case: if
'size*max(prob)>1' then duplicates may occur, and if
'size*max(prob)>2' then duplicates are guaranteed.
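
The unequal-probability steps can be sketched as follows. This is a rough Python illustration, not the S-PLUS source. It assumes that "systematic sample" means 'size' equally spaced points with a single uniform random start, and it works in units of the spacing 1/size, so that block 'j' has length 'size*prob[j]'. The function name 'sample_minimal_unequal' and its arguments are hypothetical.

```python
import random

def sample_minimal_unequal(prob, size, order=True, rng=random):
    """Systematic sampling with probabilities proportional to `prob`.

    Hypothetical helper for illustration; not the S-PLUS implementation.
    """
    total = sum(prob)
    # Randomly permute the observations (and hence their blocks).
    perm = list(range(len(prob)))
    rng.shuffle(perm)
    # Systematic points sit at start + k for k = 0..size-1, in units of
    # 1/size (assumed: one random start, equal spacing).
    start = rng.random()
    out, cum, k, last = [], 0.0, 0, None
    for j in perm:
        if prob[j] > 0:
            last = j
        cum += size * prob[j] / total  # right end of block j
        # Assign every not-yet-used point that falls inside block j.
        while k < size and start + k < cum:
            out.append(j)
            k += 1
    # Guard against floating-point round-off at the right edge.
    while k < size:
        out.append(last)
        k += 1
    # Final random permutation of the selected observations.
    if order:
        rng.shuffle(out)
    return out
```

A block of length L (in these units) always receives either floor(L) or ceiling(L) points, which is why duplicates become possible once 'size*max(prob)>1' and certain once it exceeds 2.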

The descriptions above of the equal- and unequal-probability cases are
from the help file for the next version of S-PLUS.
If you have suggestions for improvement, please let me know.

Tim Hesterberg
========================================================
| Tim Hesterberg Senior Research Scientist |
| timh@insightful.com Insightful Corp. |
| (206)802-2319 1700 Westlake Ave. N, Suite 500 |
| (206)283-8691 (fax) Seattle, WA 98109-3044, U.S.A. |
| www.insightful.com/Hesterberg |
========================================================
Download S+Resample from www.insightful.com/downloads/libraries
I'll teach short courses:
Advanced Programming in S-PLUS: San Antonio TX, March 26-27, 2008.
Bootstrap Methods and Permutation Tests: San Antonio, March 28, 2008.
Links for more info are at www.insightful.com/Hesterberg