Re: Library for fractionating arbitrary factorial designs?

To: Joseph Davis <joe.davis@pdf.com>
Subject: Re: Library for fractionating arbitrary factorial designs?
From: Spencer Graves <spencer.graves@pdf.com>
Date: Thu, 17 Jun 2004 09:13:08 -0500
Cc: s-news@lists.biostat.wustl.edu
In-reply-to: <40D0A952.6040504@pdf.com>
References: <40D0A952.6040504@pdf.com>
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 Netscape/7.1 (ax)
Hi, Joe: I hope your question generates a discussion; I'll offer a few comments here.
     First, we must be careful about terminology: I forget the details, but the standard "fractionation" techniques select subgroups or subalgebras of something like the integers modulo k for factors with k levels. Box and Wilson (1951) showed that when k>2, this can make third-order interactions estimable when the parabolic terms are not! Therefore, the literature since 1951 has tended to focus more on designs with various kinds of "optimality", e.g., maximizing the determinant or the trace or the smallest eigenvalue of the X'X matrix. These criteria are relatively easy to compute for any design. Unfortunately, we are usually more interested in design attributes like minimizing the variance of predictions or of individual regression coefficients in the regression equation.
     Box also said that a primary purpose of statistical methods is to catalyze the process of scientific thinking. For that purpose, he has used the concept of "projectivity", which has more recently been generalized to "estimation capacity".
     For your DoE, are your factors at 3 and 4 levels quantitative or purely qualitative? If qualitative, might there be a natural order? I generally prefer to treat ordered, qualitative factors as though they were quantitative, because I usually suspect that a linear term would more likely be important than a parabolic term in almost any reasonable coding; otherwise, I must estimate at least two coefficients for a main effect and at least two coefficients for each 2-factor interaction between a purely qualitative factor and any other factor.
     The next question is whether you have enough runs to estimate all the coefficients you want to estimate. If not, then you need a design with good projectivity and estimation capacity for the models most likely to be of interest. My recent work in this area has suggested that Taguchi's mixed-level orthogonal arrays are fairly good for these purposes.
     If you do have enough runs, then I recommend central composite designs or small composite designs or Hoke designs, depending on how many degrees of freedom you have to estimate lack of fit. Below, please find code I recently wrote to compute a Hoke design of any order. These designs allow estimation of a full quadratic model with the minimum number of runs. If I have some 2-level and some 3-level factors, I may start with something like this and convert some of the 3-level factors to 2-level factors. On other occasions, with some 4- or 5-level factors, I've started with something like a central composite design and modified it to increase the number of levels.
     Hope this helps.  spencer graves
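The design criteria mentioned above (the determinant, trace and smallest eigenvalue of X'X, and the variances of coefficients and predictions) can be computed in a few lines of R. The following is only a minimal sketch: the example design (a 2^2 factorial plus one center point), the first-order model, and the names `design`, `criteria`, `coef.var` and `pred.var` are illustrative assumptions, not part of the original post.

```r
# Hypothetical example design: 2^2 factorial plus a center point, coded -1/0/1
design <- rbind(c(-1, -1), c(1, -1), c(-1, 1), c(1, 1), c(0, 0))
colnames(design) <- c("A", "B")

# Model matrix for a first-order model: intercept + A + B
X <- cbind(Intercept = 1, design)
XtX <- crossprod(X)                      # X'X

# The three "optimality" summaries named in the text
criteria <- c(det       = det(XtX),
              trace     = sum(diag(XtX)),
              min.eigen = min(eigen(XtX, symmetric = TRUE)$values))

# Variance factors (multiples of sigma^2) for the individual coefficients
XtX.inv <- solve(XtX)
coef.var <- diag(XtX.inv)

# Variance factor for a prediction at a point x0 = (A, B)
pred.var <- function(x0) {
  x <- c(1, x0)
  drop(t(x) %*% XtX.inv %*% x)
}

print(criteria)
print(coef.var)
print(pred.var(c(1, 1)))
```

Comparing these numbers across candidate designs for the same model is one simple way to rank them. Which criterion matters most depends on whether the coefficients or the predictions are of primary interest.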
##################################
#
#*******NOTE: SOME WEB BROWSERS CONVERT A CIRCUMFLEX CHARACTER FOR EXPONENTIATION
#*******INTO A SUPERSCRIPT. THIS CAN MAKE THE CODE NONFUNCTIONING.
#*******IF YOU HAVE TROUBLE WITH THIS, CONTACT ME OFFLINE, AND I'LL SEND THE FOLLOWING
#*******IN A ZIP FILE.
#*******
#******* TESTED IN R, NOT S-PLUS
#
# Reference:
# Hoke, Albert T. (1974) Economical Second-Order Designs Based on
#   Irregular Fractions of the 3^n Factorial. Technometrics 16: 375-384.
#install.packages(c("AlgDesign", "fields", "conf.design"))
#library(AlgDesign)
#library(fields)
#library(conf.design)

i <- 0:8
base.k.number <- function(i, k=3){
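# Count the digits with exact integer arithmetic. The original
# ceiling(logb(max(i)+1, k)) can come out one too large when
# floating-point rounding lifts the log of an exact power of k
# slightly above the integer.
 ndig <- 1
 while(k^ndig <= max(i)) ndig <- ndig+1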
 N <- length(i)
 i.k <- array(NA, dim=c(N, ndig))
 resid <- i # j=1
 for(j in 0:(ndig-1)){
   i.k[,ndig-j] <- resid %% k
   resid <- resid%/%k
 }
 if(N==1)as.vector(i.k) else i.k
}

Hoke.S.class <- function(n){
# all 3^n runs
# i = 0:(3^n-1)
# For each, j= # 2's in base 3 representation
#          (n-r)= # 1's in base 3 rep.
#           r= # 0's + 2's
# allPossibles
 i.k <- base.k.number(0:((3^n)-1))
# Classify by Sr.j
 N <- dim(i.k)[1]
 Sr.j <- array(NA, dim=c(N, 2))
 dimnames(Sr.j) <- list(NULL, c("r", "j"))
 Sr.j[,"r"] <- apply(i.k!=1, 1, sum)
 Sr.j[,"j"] <- apply(i.k==2, 1, sum)
 Sr.j. <- paste(Sr.j[,"r"], Sr.j[,"j"], sep=".")
# Count number in each class
 n.Sr.j <- table(Sr.j.)
# Done
 list(allPossibles=i.k, S.class=Sr.j., nrow.Sr.j=n.Sr.j)
}

Hoke.design <- function(n,
    treatment.names=if(n<6)LETTERS[1:n] else
       paste(LETTERS[1:n], ".", sep=""),
                       subsets=NULL){
# if(n=1) use all
# if(n=2) use (0,0),(2,1),(1,0),(2,2)
# if(n=3) use (0,0),
 if(n==1){
   X <- array(c(0, -1, 1), dim=c(3,1))
 }
 else{
   if(is.null(subsets)){
     if(n==2)
       subsets <- c("0.0","2.1", "1.0", "2.2")
     else {
       if(n==3)
         subsets <- c("0.0","3.1","2.0","3.2")
       else
         subsets <- paste(c(0, n, n-1, n),
                   c(0, n-1, 0, 2), sep=".")
     }
   }
   HokeS <- Hoke.S.class(n)
   sel <- (HokeS$S.class %in% subsets)
   X <- (HokeS$allPossibles[sel,]-1)
 }
 dimnames(X) <- list(NULL, treatment.names)
 X
}
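
The claim that these designs allow a full quadratic with the minimum number of runs can be checked numerically. The sketch below is self-contained and does not call the functions above. It rebuilds the same class selection with `expand.grid` (so the row order may differ from `Hoke.design`) and verifies that the full second-order model matrix has (n+1)(n+2)/2 rows and full column rank. The names `hoke_sketch` and `quad_matrix` are hypothetical helpers introduced only for this illustration, and the check assumes n >= 2.

```r
# All 3^n points coded -1/0/1, classified as in Hoke.S.class:
#   r = number of nonzero coordinates, j = number of +1 coordinates
hoke_sketch <- function(n) {
  grid <- as.matrix(expand.grid(rep(list(-1:1), n)))
  r <- rowSums(grid != 0)
  j <- rowSums(grid == 1)
  cls <- paste(r, j, sep = ".")
  # Same default subsets as in Hoke.design above
  subsets <- if (n == 2) {
    c("0.0", "2.1", "1.0", "2.2")
  } else if (n == 3) {
    c("0.0", "3.1", "2.0", "3.2")
  } else {
    paste(c(0, n, n - 1, n), c(0, n - 1, 0, 2), sep = ".")
  }
  X <- grid[cls %in% subsets, , drop = FALSE]
  dimnames(X) <- list(NULL, LETTERS[1:n])
  X
}

# Full quadratic model matrix: intercept, linear, pure quadratic
# and 2-factor interaction columns (assumes at least 2 factors)
quad_matrix <- function(X) {
  pairs <- combn(ncol(X), 2)
  cross <- apply(pairs, 2, function(p) X[, p[1]] * X[, p[2]])
  cbind(1, X, X^2, cross)
}

# Compare run count and rank with the number of quadratic coefficients
for (n in 2:4) {
  X <- hoke_sketch(n)
  p <- (n + 1) * (n + 2) / 2
  cat("n =", n, " runs =", nrow(X), " coefficients =", p,
      " rank =", qr(quad_matrix(X))$rank, "\n")
}
```

When the rank equals the number of coefficients, the design is saturated for the quadratic model: every coefficient is estimable, but no degrees of freedom are left for lack of fit.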





Joseph Davis wrote:

Hello Snews,

I am building a DOE with up to 10 factors with a mix of 2,3,4 levels for each of the factors. Of course, I'd like to fractionate this design intelligently to get the results I need in the minimum number of runs.

Ideally, I'd like to be able to describe the model (or set of candidate models) I want to estimate from the DOE and have the software pop out the minimum fractionated design or set of designs. The fractionate function in Splus only handles 2^k designs. Does anyone have or know of functions to accomplish this task, either in Splus, standalone, or in some other package?

Thank you very much for your help!

regards,
jcd


