
Re: derive intervals a subject is taken the same dose

To: "Stuyckens, Kim [PRDBE]" <KSTUYCK1@PRDBE.JNJ.COM>
Subject: Re: derive intervals a subject is taken the same dose
From: David L Lorenz <lorenz@usgs.gov>
Date: Wed, 21 Jun 2006 11:54:57 -0500
Cc: "'s-news@lists.biostat.wustl.edu'" <s-news@lists.biostat.wustl.edu>, s-news-owner@lists.biostat.wustl.edu
In-reply-to: <D72CDC56EAC21D4A9698B069A30AC2290C3F8ACE@janbebeexs3.eu.jnj.com>

Kim,
  I think it is pretty easy if you use the diff() function. Here's code to start with and the output.

cbind(a[c(1,diff(a$DOSE)) != 0,],ENDDATE=a[c(diff(a$DOSE),1) != 0,3])
   ID DOSE       DATE    ENDDATE
 1  1 0.25 01/01/2006 01/02/2006
 3  1 0.50 01/03/2006 01/03/2006
 4  1 0.25 01/04/2006 01/05/2006
 6  1 2.50 01/06/2006 01/07/2006
 8  1 0.25 01/08/2006 01/08/2006
 9  2 2.50 01/09/2006 01/11/2006
12  2 3.50 01/12/2006 01/12/2006
13  2 2.50 01/13/2006 01/14/2006
15  2 2.00 01/15/2006 01/16/2006
17  2 4.00 01/17/2006 01/17/2006
18  3 2.00 01/18/2006 01/18/2006
19  3 2.50 01/19/2006 01/20/2006
21  3 2.00 01/21/2006 01/22/2006

  The first selection extracts the first row of each run of repeated DOSE values; the second extracts the DATE of the last row of each run. I only checked for a change in DOSE; you will need to add a check for a change in ID, using the same diff logic on ID and the | logical operator.
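
  A minimal sketch of that combined check, under some assumptions: DATE is stored as plain character strings rather than timeDate objects, and the names changed, first, last and result are made up for illustration. It has not been run in S-PLUS 6.2, but uses only diff(), paste(), ifelse() and logical indexing.

```r
# Example data from the original post. DATE is kept as character strings
# (an assumption) so the sketch does not depend on timeDate().
a <- data.frame(
  ID   = c(1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,3,3,3,3,3),
  DOSE = c(.25,.25,.5,.25,.25,2.5,2.5,.25,2.5,2.5,2.5,3.5,2.5,2.5,2,2,4,2,2.5,2.5,2,2),
  DATE = paste("01/", ifelse(1:22 < 10, "0", ""), 1:22, "/2006", sep = "")
)

# TRUE wherever a row differs from the previous row in DOSE or in ID
changed <- diff(a$DOSE) != 0 | diff(a$ID) != 0

first <- c(TRUE, changed)   # marks the first row of each run
last  <- c(changed, TRUE)   # marks the last row of each run

# One output row per run: its ID and DOSE, plus first and last DATE
result <- data.frame(ID = a$ID[first], DOSE = a$DOSE[first],
                     STARTDATE = a$DATE[first], ENDDATE = a$DATE[last])
result
```

  Because first and last are both built from the same changed vector, they always select the same number of rows, so the start and end dates line up run by run. The data must already be sorted by ID and then DATE for this to work.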
Dave


"Stuyckens, Kim [PRDBE]" <KSTUYCK1@PRDBE.JNJ.COM>
Sent by: s-news-owner@lists.biostat.wustl.edu

06/21/2006 09:35 AM

To: "'s-news@lists.biostat.wustl.edu'" <s-news@lists.biostat.wustl.edu>
cc:
Subject: [S] derive intervals  a subject is taken the same dose

Dear S-News users,

I have a dataset containing thousands of rows, which looks as follows:
ID      DOSE    DATE
1       0.25    01/01/2006
1       0.25    01/02/2006
1       0.50    01/03/2006
1       0.25    01/04/2006
1       0.25    01/05/2006
1       2.50    01/06/2006
1       2.50    01/07/2006
1       0.25    01/08/2006
2       2.50    01/09/2006
2       2.50    01/10/2006
2       2.50    01/11/2006
2       3.50    01/12/2006
2       2.50    01/13/2006
2       2.50    01/14/2006
2       2.00    01/15/2006
2       2.00    01/16/2006
2       4.00    01/17/2006
3       2.00    01/18/2006
3       2.50    01/19/2006
3       2.50    01/20/2006
3       2.00    01/21/2006
3       2.00    01/22/2006

a <- data.frame(ID = c(1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,3,3,3,3,3),
                DOSE = c(0.25,0.25,0.5,0.25,0.25,2.5,2.5,0.25,2.5,2.5,2.5,3.5,2.5,2.5,2,2,4,2,2.5,2.5,2,2),
                DATE = timeDate(as.character(seq.dates("01/01/2006", by="days", length=22))))

I would like to find the time intervals during which a given ID (a subject, in this case) takes the same dose. Note that the same dose can recur for the same ID at a later point in time.

The desired result looks as follows:

ID      DOSE    STARTDATE       ENDDATE
1       0.25    01/01/2006      01/02/2006
1       0.50    01/03/2006      01/03/2006
1       0.25    01/04/2006      01/05/2006
1       2.50    01/06/2006      01/07/2006
1       0.25    01/08/2006      01/08/2006
2       2.50    01/09/2006      01/11/2006
2       3.50    01/12/2006      01/12/2006
2       2.50    01/13/2006      01/14/2006
2       2.00    01/15/2006      01/16/2006
2       4.00    01/17/2006      01/17/2006
3       2.00    01/18/2006      01/18/2006
3       2.50    01/19/2006      01/20/2006
3       2.00    01/21/2006      01/22/2006

Until now I have solved this with a loop that looks for the points where the dose changes, but I hope there is a more efficient way, because the loop takes too much time and causes memory problems with the amount of data I have.

Any input or suggestions are very much appreciated.

BTW, I am using S-PLUS 6.2 in a Windows environment.

Thanks in advance,
Kind regards,

Kim
