Dear all,
Regarding my solution to the 'all possible combinations question'
The function dec2bin didn't do what it was supposed to do:
Here is the correct version:
dec2bin <- function(number, n)
{
out <- rep(0,n)
for (i in (n-1):0)
{
if (number >= 2^i)
{
out[i+1] <- 1
number <- number-2^i
}
}
out
}
Sorry about that
Bjarke
At 08:45 20-09-00 -0700, Ken Banks wrote:
>
>Dear All,
>
>Here is a summary of my post on 9-19-00 concerning all possible combinations
>of a 2^20 binary variable dataset. The original question was:
>
>
><<I need to produce a dataset containing all possible combinations of the on
>or off states of 20 binary variables. This should produce 2^20 or
>1,048,576 possible combinations. Does anyone have any hints for an
>efficient way to accomplish this task? Is it possible using SPlus? Any
>hints are greatly appreciated!>>
>
>
>Many thanks to John Gavin, Patrick Connolly, Nick Ellis, Tim Hesterberg,
>Charles Berry, Michael Ernst, Greg Snow, Peter Wolf, Scott Chasalow, Andy
>Liaw, Ronald Fehd, Matt Calder, Dennis Murphy, Ursula Becker, Bert Gunter,
>Daniel Heitjan, S.D. Beyers, and Bjarke Klein. I greatly appreciate all
>taking the time to reply to my question. Memory is still a problem for most
>solutions, which is related in part to my whimpy computer. Although I
>haven't produced the full dataset, the looping solution of Bjarke Klein may
>solve this problem.
>
>Thanks again!
>
>-Ken Banks.
>
>
>>From John Gavin,
>
>states <- function(n) expand.grid( lapply(seq(n), function(x) 0:1) )
>
>seems to work but seems very memory ineffecient.
>
>resources(ans <- states(20))
>> resources(ans <- states(20))
> secs working cache
> 6.65 190725264 9492
>
>
>
>>From Bjarke Klein
>Pick a random integer between 0 and 2^20 (both included)
>Convert this number to binary representation
>This will correspond to a series of 20 binary observations.
>
>I do not know whether this solves your problem
>
>I have included a code-example:
>
># Function to change from radix 10 to radix 2
># n must be the minimum n such that 2^n>number
>dec2bin <- function(number, n)
>{
> out <- rep(0,n)
> for (i in n:1)
> {
> if (number >= 2^i)
> {
> out[i] <- 1
> number <- number-2^i
> }
> }
>out
>}
>
># example:
>
># number corresponds to a combination
>number <- round(runif(1,0,2^20),0)
>
>number
>
>dec2bin(number,20)
>
>In this way you can generate the entire data set by looping through the
>numbers from 0 to 2^20
>
>from Patrick Connolly
>Just a hint. Try expand.grid().
>
>from Nick Ellis
>One way is with expand.grid, something like this:
>
>expand.grid(v1=0:1, v2=0:1, v3=0:1, <and so on until>, v20=0:1)
>
>This gives a dataframe with 20 columns and 2^20 rows.
>
>Another way is to calculate each of the 20 variables from the corresponding
>bit of the sequence 0:(2^20 - 1), something like this:
>
>bigseq <- 0:(2^20-1)
>data.frame(v1=bigseq %% 2, v2=(bigseq %/% 2) %% 2, v3=(bigseq %/% 4) %% 2,
><and so on until>, v20=(bigseq %/% (2^19)) %% 2)
>
>Of course there are slicker ways of doing this without writing things out 20
>times, e.g. (first method)
>
>n <- 20
>arglist <- vector("list",n)
>names(arglist) <- paste("v",1:n,sep="")
>for (i in 1:n) arglist[[i]] <- 0:1
>result <- do.call("expand.grid",arglist)
>
>and (second method)
>
>n <- 20
>bigseq <- 0:(2^n-1)
>bigmat <- matrix(bigseq,nrow=2^n,ncol=n)
>bigdiv <- matrix(2^(1:n - 1),nrow=2^n,ncol=n,byrow=T)
>result <- (bigmat %/% bigdiv) %% 2
>dimnames(result) <- list(NULL,paste("v",1:n,sep=""))
>
>>From Tim Hesterberg
>Here's an function to produce base-2 representations.
>You would use it as:
> base2(0:(2^19-1))
>
>base2 <- function(m)
>{
> ##Express m as powers of 2; m may be vector
> n <- length(m)
> maxi <- floor(log(max(m)/log(2)))
> result <- matrix( 0, n, maxi+1 )
> for(i in 1:(maxi+1)){
> result[,i] <- m %% 2
> m <- m %/% 2
> }
> result
>}
>
>Another approach uses cbind() and rep(..., each=2):
>
>y <- matrix(0:1,2)
>for(i in 1:5)
> y <- cbind(rbind(y,y), rep(0:1, each=2^i))
>y
>
>>From Charles Berry
>bits.20 <- outer(seq(0,length=20^2),2^(19:0),function(x,y) x %/% y %% 2)
>
>
>>From Michael Ernst
>The following function, from page 147 of the 3rd edition of VR, returns
>a matrix with each column containing the binary representation of each
>number in the vector "x" with "digits" number of digits.
>
>binary.v <-
>function(x, digits)
>{
> if(missing(digits)) {
> mx <- max(x)
> digits <- if(mx > 0) 1 + floor(log(mx, base = 2)) else 1
> }
> ans <- 0:(digits - 1)
> lx <- length(x)
> x <- rep(x, rep(digits, lx))
> x <- (x %/% 2^ans) %% 2
> dim(x) <- c(digits, lx)
> x
>}
>
>And binary.v(0:1048575,20) would give you what you want.
>
>
>>From Greg Snow
>try:
>
>temp <- list(A=0:1)
>for (i in 2:20) temp[[ LETTERS[i] ]] <- 0:1
>temp2 <- expand.grid( temp )
>dim(temp2)
>
>don't expect it to be fast, and you had better have a lot of memory.
>
>>From Peter Wolf
>
>here is an old solution to get the 2^n patterns as rows of a matrix
>
>comb<-function(n){
> comb<-NULL
> for( i in 1:n) comb<-rbind(cbind(1,comb),cbind(0,comb))
> comb
>}
>
>>From Scott Chasalow
>The following might work for you, though 2^20 is pushing it.
>
>Under Splus 4.5 for Windows, running under Win NT4 on a Pentium II 300Mhz
>with 128MB RAM:
>
>> hcube(rep(2, 3), translation = -1)
> [,1] [,2] [,3]
>[1,] 0 0 0
>[2,] 1 0 0
>[3,] 0 1 0
>[4,] 1 1 0
>[5,] 0 0 1
>[6,] 1 0 1
>[7,] 0 1 1
>[8,] 1 1 1
>
>> options(object.size = 20000000)
>> dos.time(hcube(rep(2, 2), translation = -1))
>[1] 0.01998901
>> dos.time(hcube(rep(2, 4), translation = -1))
>[1] 0.02999878
>> dos.time(hcube(rep(2, 8), translation = -1))
>[1] 0.04000854
>> dos.time(hcube(rep(2, 12), translation = -1))
>[1] 0.1400146
>> dos.time(hcube(rep(2, 14), translation = -1))
>[1] 0.5299988
>> dos.time(hcube(rep(2, 16), translation = -1))
>[1] 2.279999
>> dos.time(hcube(rep(2, 17), translation = -1))
>[1] 18.56
>> dos.time(hcube(rep(2, 18), translation = -1))
>Error in data[1:ll] <- old: Cannot allocate 37748960 bytes: options("object.
>size") is 20000000: see options help file
>Dumped
>Timing stopped at: 2.84
>
>It appears that my machine needs to start swapping at 2^17, so things really
>slow down. I think I'd have serious problems with a 2^20 x 20 matrix; I
>don't even want to try it. But on a fast machine with lots of RAM, it might
>be doable in Splus. Maybe hcube() could be made less memory-hungry too.
>
>If I HAD to do this in Splus, and I didn't need the whole matrix at once, I
>might be inclined to work with it in pieces, e.g. by cbinding hcube(rep(2,
>16), translation = -1) to the appropriate submatrices of the last four
>columns (which can be generated by replicating rows of hcube(rep(2, 4),
>translation = -1)).
>
>You can find the hcube() function in my combinat module,
>
>http://www.spg.wau.nl/pv/pub/chasalow/S/win/combinat
>
>or
>
>http://www.spg.wau.nl/pv/pub/chasalow/S/unix/combinat
>
>>From Andy Liaw
>
>Try expand.grid(x1,x2,x3,...), where x1, x2, ... are, for example, c(0,1).
>
>>From Ronald Fehd
>Here is a SAS program that will create your data set.
>
>%macro makedata(N);
>data B&N.;
>length default = 4;
>%DO I = 1 %TO &N.;
> do B&I = 0,1; %END;
>output;
>%DO I = 1 %TO &N.;
> end; %END;
>proc CONTENTS data = B&N.;
>run; %MEND;
>%MAKEDATA(3);
>
>>From Matt Calder
>
>If you use C this is easy, just test bits in the integers 0 - (2^21-1).
>Depending on the particulars of your problem, a simple .C call could give
>you
>what you want. Keep in mind that iterating over 2^21 values in S is going to
>be slow.
>
>>From Dennis Murphy
>Here's a version of Bill Venables' subsets program, which does what you're
>looking for:
>
>Subsets <- function(r, n, v = 1:n)
> ## S3/S4 version. Generate subsets, caching results in frame 0
> if (r <= 0) vector(mode(v), 0) else
> if (r >= n) v[1:n] else {
> if(r > 1) {
> i1 <- paste(".", n-1, r-1, sep="/")
> i2 <- paste(".", n-1, r, sep="/")
> if(!exists(i1, frame = 0))
> assign(i1, as.vector(Subsets(r-1, n-1, 1:(n-1))), frame = 0)
> if(!exists(i2, frame = 0))
> assign(i2, as.vector(Subsets(r, n-1, 1:(n-1))), frame = 0)
> m1 <- matrix(v[-1][get(i1, frame = 0)], ncol = r-1)
> m2 <- matrix(v[-1][get(i2, frame = 0)], ncol = r)
> } else {
> m1 <- NULL
> m2 <- matrix(v[2:n], ncol = 1)
> }
> rbind(cbind(v[1], m1), m2)
> }
>
>It's possible that swap space and time may be issues with this function
>given the number of pairs requested.
>
>>From Ursula Becker,
>the function fac.design should do this.
>assign
>fnames_list(one=c("0","1"), two=..., twenty=c("0","1")), then
>test_fac.design(rep(2,20),factor.names=fnames)
>
>Or if you don't care about the column names:
>fnames_rep(list(one=c("0","1")),20)
>
>
>>From Bert Gunter,
>Look in the S-Plus archives for Bill Venables' subsets() (or subset() ? )
>function. Or use mine (below). Finding all possible subsets is the same as
>producing all binary sequences. Note that both Bill's and my solutions use
>recursion, which is a very natural way to do this, but sometimes slow
>because it is not very efficiently implemented in S-Plus. For my
>application, this made no difference.
>
>>From Daniel Hietjan
>I would do it in Fortran. It will be many times faster.
>
>
>>From S.D. Beyers
>temp <- fac.design(levels=rep(2,20))
>
>
>will give you that matrix but you are throwing about a matrix as large as
>you create. A matrix 1M by 20 will always be big in Splus, whether you
>generate it in the fly or in one go.
>
>simplified form can be got thus
>
>result <- matrix(unlist(temp)-1, ncol=20, byrow=F)
>
>The above commands work on my machine and if nothing else are effient
>in terms of programming.
>
>
>-----------------------------------------------------------------------
>This message was distributed by s-news@wubios.wustl.edu. To unsubscribe
>send e-mail to s-news-request@wubios.wustl.edu with the BODY of the
>message: unsubscribe s-news
>
Bjarke Klein, Cand.Scient., Ph.D.-stud.
Department of Statistics and Demography
University of Southern Denmark
Phone: +45 65 50 36 07
http://www.statdem.sdu.dk/~bkl/
bjarke@statdem.sdu.dk
-----------------------------------------------------------------------
This message was distributed by s-news@wubios.wustl.edu. To unsubscribe
send e-mail to s-news-request@wubios.wustl.edu with the BODY of the
message: unsubscribe s-news
|