
Re: factor levels

To: <yiwu21111958@yahoo.com>, <s-news@lists.biostat.wustl.edu>
Subject: Re: factor levels
From: <Bill.Venables@csiro.au>
Date: Fri, 19 Jan 2007 15:27:38 +1000
References: <20070119003842.30907.qmail@web57105.mail.re3.yahoo.com>
Thread-topic: [S] factor levels

 

 

Ye Yiwu asks:


From: s-news-owner@lists.biostat.wustl.edu [mailto:s-news-owner@lists.biostat.wustl.edu] On Behalf Of yiwu ye
Sent: Friday, 19 January 2007 10:39 AM
To: s-news@lists.biostat.wustl.edu
Subject: [S] factor levels

 

Dear list,

I have the following problem. Could anyone help?

x <- as.factor(letters[1:5])
x <- x[-4]
levels(x)


levels(x) still gives 5 rather than 4 levels. Of course, I can do levels(as.factor(as.character(x))) to get rid of the non-existent factor level. But I think there must be a simpler way.

[WNV] It’s not clear that there is, or even should be, a simpler way.  It depends entirely on what you want to happen when you alter the factor by removing a component.

 

A factor is used to define a classification.  It has two parts that, in principle, operate independently: the names of the classes into which the factor can assign membership (i.e. the factor levels), and the membership of each candidate that the factor classifies.  If you remove one of the candidates, and by so doing leave a particular group empty, it’s not clear that the levels should change to exclude that group.
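
A small sketch of this separation, written in R syntax (the object name f and its values are made up for illustration): the levels name the classes, the integer codes record membership, and a class can be declared without having any members.

```r
# Hypothetical factor: three declared classes, only two of them occupied
f <- factor(c("b", "c", "b"), levels = c("a", "b", "c"))

class_names <- levels(f)      # the classes the factor can assign
membership  <- as.integer(f)  # position in class_names for each element
counts      <- table(f)       # level "a" is reported with a count of zero

print(class_names)
print(membership)
print(counts)
```

The empty class "a" still appears in the table, which is often what you want when comparing several subsets tabulated against the same classification.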

 

Even the solution you propose will not completely do what you want to happen, anyway, since it will re-order the levels so that their names are in sorted order.  If the original names were not in sorted order, you would have altered that as well.
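
The re-ordering can be seen in a short sketch, assuming R syntax and a made-up factor whose levels are deliberately in reverse alphabetical order (the names x, y and roundtrip are illustrative only):

```r
# Hypothetical factor with levels in reverse alphabetical order
x <- factor(c("a", "b", "c", "d", "e"), levels = c("e", "d", "c", "b", "a"))
y <- x[-1]                               # drop the only "a"; levels unchanged

roundtrip <- as.factor(as.character(y))  # the approach from the question

print(levels(y))          # original order, empty level "a" still present
print(levels(roundtrip))  # "a" is gone, but the rest are now sorted
```

The round trip through character does drop the empty level, but it also discards whatever ordering the levels had before.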

 

I take it that what you would like to happen is that when you make one or more of the levels of a factor empty, the new factor has a reduced levels attribute.

 

Here is a function that will do this, but I emphasise that it is quite arbitrary:

 

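fixFactor <- function(x) {
    # keep only the levels that still have at least one member,
    # preserving their original order
    newLevels <- levels(x)[table(x) > 0]
    # rebuild the factor on the reduced set of levels
    factor(as.character(x), levels = newLevels)
}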

 

Here is a brief demo:

 

> x
 [1] a b c d e a b c d e
> table(x)  # levels in reverse alpha order
 e d c b a
 2 2 2 2 2
> y <- x[-c(1,6)]
> y  # note that the printout implicitly warns levels missing
[1] b c d e b c d e
Levels:
[1] "e" "d" "c" "b" "a"
> table(y)
 e d c b a
 2 2 2 2 0
> y <- fixFactor(y)
> table(y)
 e d c b
 2 2 2 2
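
For readers running this in R rather than S-PLUS (an assumption; the thread itself concerns S, and availability of these calls in S-PLUS is not checked here), base R offers ways to drop empty levels while keeping the existing level order. A minimal sketch on made-up data mirroring the demo above:

```r
# Hypothetical data: levels in reverse alphabetical order, as in the demo
x <- factor(rep(c("a", "b", "c", "d", "e"), 2),
            levels = c("e", "d", "c", "b", "a"))
y <- x[-c(1, 6)]                 # removes both "a" entries

y1 <- droplevels(y)              # base R, version 2.12.0 or later
y2 <- factor(y)                  # re-factoring keeps order, drops empty levels
y3 <- x[-c(1, 6), drop = TRUE]   # drop unused levels while subsetting

print(levels(y1))
print(levels(y2))
print(levels(y3))
```

Whether dropping the level is the right thing to do remains a matter of intent, as discussed above; these calls only make the choice explicit.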



Thanks

Yiwu

 


Bill Venables
CMIS, CSIRO Laboratories,
PO Box 120, Cleveland, Qld. 4163
AUSTRALIA
Office Phone (email preferred): +61 7 3826 7251

Fax (if absolutely necessary):    +61 7 3826 7304
Mobile (rarely used):                +61 4 1963 4642
Home Phone:                          +61 7 3286 7700
mailto:Bill.Venables@csiro.au
http://www.cmis.csiro.au/bill.venables/
