s-news

Re: split data set by two variables

To: Zheng_Jenny <Zheng_Jenny@Allergan.com>
Subject: Re: split data set by two variables
From: Sebastien Bihorel <Sebastien.Bihorel@cognigencorp.com>
Date: Wed, 21 Jan 2009 14:37:45 -0500
Cc: s-news@wubios.wustl.edu
In-reply-to: <37A0A965AB353B4D93AE86C2F16FF3825690DA@IRMAIL134.irvine.allergan.com>
References: <37A0A965AB353B4D93AE86C2F16FF3825690DA@IRMAIL134.irvine.allergan.com>
User-agent: Thunderbird 2.0.0.9 (Windows/20071031)
Hi Jenny,

Let's say that your data are in a data frame called mydf. You could split this data frame into a list with x elements, where x is the total number of different combinations of V2 and V3 values, as described below. Each element will contain one subset of your original data frame. After that, you can do what you want with this list.

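# Build one factor whose levels are the V2:V3 combinations (columns 2 and 3)
split.factor <- do.call(interaction, c(mydf[, c(2, 3)], sep = ":"))
# One data frame per level of that factor
mydf.split <- split(mydf, split.factor)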
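
As a rough sketch of what can be done with the resulting list (assuming R-style syntax; the small data frame below is a made-up stand-in for mydf and its values are illustrative only), each subset can be pulled out by its "V2:V3" name or processed in one pass:

```r
# Hypothetical stand-in for mydf; only the column layout follows the original post
mydf <- data.frame(V1 = 1:6,
                   V2 = c("A", "D", "D", "A", "B", "A"),
                   V3 = c(100, 100, 300, 100, 200, 300),
                   V4 = c(0.5, -1.2, 0.3, 0.8, -0.4, 1.1))

# One factor level per V2:V3 combination, then one data frame per level
split.factor <- interaction(mydf$V2, mydf$V3, sep = ":")
mydf.split <- split(mydf, split.factor)

names(mydf.split)        # level names such as "A:100"
mydf.split[["A:100"]]    # rows with V2 == "A" and V3 == 100

# Apply the same summary to every subset, here the mean of V4
v4.means <- sapply(mydf.split, function(d) mean(d$V4))

# Combinations that never occur give empty data frames; drop them if unwanted
mydf.nonempty <- mydf.split[sapply(mydf.split, nrow) > 0]
```

In R, interaction() also accepts lex.order = TRUE, which should order the levels as A:100, A:200, and so on, if the numbering of the tables matters.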

Hope it helps

Sebastien Bihorel, PharmD, PhD
PKPD Scientist
Cognigen Corp
Email: sebastien.bihorel@cognigencorp.com
Phone: (716) 633-3463 ext. 323


Zheng_Jenny wrote:

Dear all,

I have a data set with 4 columns and I would like to split it into 16 tables based on V2 and V3. The data are given below:

V1  V2  V3   V4
 1  A  100  -0.276080217
 2  D  100  -0.755085249
 3  D  300   0.070568229
 4  D  100  -0.034123587
 5  C  300  -0.783312701
 6  B  200  -0.009916936
 7  C  100   0.121634851
 8  A  300   1.179237171
 9  A  300  -0.082016520
10  D  100  -0.354451950
11  C  400  -0.380629658
12  B  100  -1.086790089
13  B  400  -1.193285781
14  B  100   1.256262475
15  A  300   2.601771332
16  B  400   0.486331159
17  C  100  -0.657802222
18  D  200   0.187357840
19  D  100   0.984822191
20  B  100   0.993325648
21  C  300  -0.122128302
22  A  200   0.653194345
23  D  200  -0.941139519
24  B  300  -0.962417052
25  B  200  -0.868492638
26  D  400  -1.369687541
27  A  300  -1.137516488
28  B  100   1.257390239
29  B  400   0.717791479
30  A  200  -0.668566890
31  C  200   0.406858215
32  C  300   0.082724360
33  A  300  -0.083015690
34  C  200  -0.040329438
35  C  200   1.308646392
36  C  200  -0.481138082
37  A  100   0.691464187
38  D  300  -0.843482695
39  C  200  -0.625104873
40  A  400  -0.141905432

 

I would like table 1 to include the rows with V2=A and V3=100, table 2 the rows with V2=A and V3=200, and so on. What would be the most effective way of doing that? Thanks,

 

Jenny

 

 
