s-news

Re: Categorization and use of if

To: "Patricia Farra" <patricia.farra@rogers.com>
Subject: Re: Categorization and use of if
From: John Fox <jfox@mcmaster.ca>
Date: Mon, 10 Jun 2002 18:53:23 -0400
Cc: s-news@lists.biostat.wustl.edu
In-reply-to: <012701c210c4$05e19720$166a2a18@bloor.phub.net.cable.rogers.com>
Dear Patricia,

At 05:16 PM 6/10/2002 -0400, Patricia Farra wrote:

1) var1<-c(5.1,4.9,4.7,4.6,5.0,5.4,4.6,5.0,4.4,4.9,5.4,4.8,4.8,
4.3,5.8,5.7,5.4,5.1,5.7,5.1,5.4,5.1,4.6,5.1,4.8,5.0,5.0,5.2,5.2,4.7,
4.8,5.4,5.2,5.5,4.9,5.0,5.5,4.9,4.4,5.1,5.0,4.5,4.4,5.0,5.1,4.8,5.1,
4.6,5.3,5.0,7.0,6.4,6.9,5.5,6.5,5.7,6.3,4.9,6.6,5.2,5.0,5.9,6.0,
6.1,5.6,6.7,5.6,5.8,6.2,5.6,5.9,6.1,6.3,6.1,6.4,6.6,6.8,6.7,6.0,
5.7,5.5,5.5,5.8,6.0,5.4,6.0,6.7,6.3,5.6,5.5,5.5,6.1,5.8,5.0,5.6,
5.7,5.7,6.2,5.1,5.7,6.3,5.8,7.1,6.3,6.5,7.6,4.9,7.3,6.7,7.2,6.5,
6.4,6.8,5.7,5.8,6.4,6.5,7.7,7.7,6.0,6.9,5.6,7.7,6.3,6.7,7.2,6.2,
6.1,6.4,7.2,7.4,7.9,6.4,6.3,6.1,7.7,6.3,6.4,6.0,6.9,6.7,6.9,5.8,
6.8,6.7,6.7,6.3,6.5,6.2,5.9)

I would like to split var1 into:
    a) equal-width-bins
    b) 7 bins


If you want seven equal-width bins, cut(var1, 7) will do the trick, producing a "category" object as a result; as.numeric(cut(var1, 7)) produces a simple numeric vector of bin numbers from 1 to 7.
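
As a rough illustration, here is a minimal sketch that uses only the first ten values of var1. The names v, bins and codes are introduced just for this example. It was written with R in mind, where cut returns a factor; details such as the class and labels of the result may differ in S-PLUS.

```r
# First ten values of var1 from the original post; v is a name used only here.
v <- c(5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9)

# Seven equal-width bins spanning the range of v (4.4 to 5.4 for this subset).
bins <- cut(v, 7)

# Interval labels, one per bin.
levels(bins)

# Bin numbers as a plain numeric vector.
codes <- as.numeric(bins)
codes

# Number of observations that fall in each occupied bin.
table(codes)
```

Note that the bins are computed from the range of whatever vector is supplied, so the bins for this ten-value subset are not the same as those for the full var1.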

2) I manually categorize var1 into 7 categories and get
x1<-c(3,2,2,2,3,3,2,3,2,2,3,2,2,1,4,4,3,3,4,3,3,3,2,3,2,3,3,3,
3,2,2,3,3,3,2,3,3,2,1,3,3,2,1,3,3,2,3,2,3,3,6,5,6,3,5,4,5,2,5,
3,3,4,4,5,4,6,4,4,5,4,4,5,5,5,5,5,6,6,4,4,3,3,4,4,3,4,6,5,4,3,
3,5,4,3,4,4,4,5,3,4,5,4,6,5,5,7,2,7,6,7,5,5,6,4,4,5,5,7,7,4,6,
4,7,5,6,7,5,5,5,7,7,7,5,5,5,7,5,5,5,6,6,6,4,6,6,6,5,5,5,4)


Your result is a bit different from that produced by cut.

I tried this
> x1<-if(var1==1) 1000000 else
+ if(var1==2) 0100000 else
+ if(var1==3) 0010000 else
+ if(var1==4) 0001000 else
+ if(var1==5) 0000100 else
+ if(var1==6) 0000010 else
+ 0000001

and got

> x1
[1] 10000

Expected result: 0010000,0100000,0100000,0100000,etc...


There are a few problems here. First, "if" is not vectorized: it acts on a single condition rather than on one condition per element, so use "ifelse" instead. Second, since the result is numeric, leading zeroes will not appear (0100000 is simply the number 100000); if you need them, use character values for the result. Third, the categorized version of your variable is x1, not var1, so the comparisons should be made against x1:

    x1 <- as.numeric(cut(var1, 7))

    x1 <- ifelse(x1==1, '1000000',
                ifelse(x1==2, '0100000',
                ifelse(x1==3, '0010000',
                ifelse(x1==4, '0001000',
                ifelse(x1==5, '0000100',
                ifelse(x1==6, '0000010', '0000001'))))))
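
An alternative to the nested ifelse calls, offered only as a sketch and not as part of the advice above, is to index a character vector of codes by the bin number. The names bin and patterns are introduced for this example, and a short sample of bin numbers stands in for as.numeric(cut(var1, 7)). It assumes every bin number is a whole number from 1 to 7.

```r
# Sample bin numbers standing in for as.numeric(cut(var1, 7)).
bin <- c(3, 2, 2, 1, 4, 7, 6, 5)

# One character code per bin; strings keep their leading zeroes.
patterns <- c("1000000", "0100000", "0010000", "0001000",
              "0000100", "0000010", "0000001")

# Indexing is vectorized: element i of the result is patterns[bin[i]].
x1 <- patterns[bin]
x1

# A numeric literal does not keep its leading zero, hence the strings.
0100000 == 100000
```

The lookup does the same job as the chain of ifelse calls for bin numbers 1 through 7, and adding or changing a code means editing only the patterns vector.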

I hope that this helps,
 John


-----------------------------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
email: jfox@mcmaster.ca
phone: 905-525-9140x23604
web: www.socsci.mcmaster.ca/jfox
-----------------------------------------------------

