I have two dataframes, df.1 and df.2. Each dataframe has the variables
subjectID and studyday. I want to create a flag in df.1 that indicates
whether the studyday in df.1 falls within a 28 day period starting at any
studyday recorded in df.2 for the same subject.
I have included some simple code to create the data structures and also
attached two manually annotated dataframes that mark which observations
should be flagged.
Two possible approaches I have used:
1. Nested "for" loops, where the outer loop subsets df.1 by subject and the
inner loop compares each individual record to df.2.
2. by() to loop through the subjects in df.1 and, within each subject,
apply() to cycle through that subject's observations and compare them
against df.2.
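Roughly, approach 1 looks like the sketch below. This is a simplified illustration rather than my exact code: flag_loop, obs and starts are hypothetical names, and I am assuming the 28 day period includes the start day, that is, days start through start + 27.

```r
# Hypothetical helper: flag rows of obs whose studyday falls in
# [start, start + window - 1] for any start day of the same subject.
flag_loop <- function(obs, starts, window = 28) {
  obs$flag <- FALSE
  for (id in unique(obs$subjectID)) {        # outer loop: one subject at a time
    rows <- which(obs$subjectID == id)
    s <- starts$studyday[starts$subjectID == id]
    for (i in rows) {                        # inner loop: one record at a time
      d <- obs$studyday[i]
      obs$flag[i] <- any(d >= s & d < s + window)
    }
  }
  obs
}

# Tiny example built from a few of the annotated rows shown further down
obs <- data.frame(subjectID = c(1, 1, 1, 2, 2),
                  studyday  = c(1, 26, 62, 45, 76))
starts <- data.frame(subjectID = c(1, 2, 2),
                     studyday  = c(22, 36, 47))
flag_loop(obs, starts)$flag
```

Every record triggers its own comparison and its own assignment into the dataframe, which is presumably where the time goes.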
Both are prohibitively slow, and I know there must be a better method. My
data is usually of moderate size (df.1 ~ 30000 observations, df.2 ~ 5000
observations).
df.1 <- data.frame(
  subjectID = rep(1:20, each = 10),
  studyday  = unlist(lapply(1:20, function(x) c(1, sort(sample(2:100, 9)))))
)
df.2 <- data.frame(
  subjectID = sort(sample(unique(df.1$subjectID), 20, replace = TRUE)),
  studyday  = sample(1:100, 20, replace = TRUE)
)
> df.1[1:20, ]
   subjectID studyday
1          1        1
2          1       26* occurs within 28 day period starting at day 22 observation in df.2
3          1       36* occurs within 28 day period starting at day 22 observation in df.2
4          1       47* occurs within 28 day period starting at day 22 observation in df.2
5          1       62
6          1       68
7          1       76
8          1       79
9          1       87
10         1       92
11         2        1
12         2       45* occurs within 28 day period starting at day 36 observation in df.2
13         2       46* occurs within 28 day period starting at day 36 observation in df.2
14         2       47* occurs within 28 day period starting at day 47 observation in df.2
15         2       53* occurs within 28 day period starting at day 47 observation in df.2
16         2       69* occurs within 28 day period starting at day 47 observation in df.2
17         2       76
18         2       84
19         2       89
20         2       93
> df.2[1:10, ]
   subjectID studyday
1          1       22
2          2       36
3          2       47
4          5       17
5          5       59
6          7       23
7          8       17
8          8       34
9          9       42
10         9       36
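One loop-free direction that might work is to merge the two dataframes on subjectID and do the comparison in a single vectorized step. The sketch below is only that, a sketch: flag_window, row_id and startday are hypothetical names, the window is assumed to include the start day (start through start + 27), and I have not timed it on data of the size described above.

```r
# Hypothetical helper: merge-based version of the flag.
flag_window <- function(obs, starts, window = 28) {
  obs$row_id <- seq_len(nrow(obs))           # remember the original rows
  names(starts)[names(starts) == "studyday"] <- "startday"
  # All (observation, start day) pairs within the same subject
  pairs <- merge(obs, starts, by = "subjectID")
  hit <- pairs$studyday >= pairs$startday &
         pairs$studyday <  pairs$startday + window
  # A row is flagged if at least one of its pairs is a hit
  obs$flag <- obs$row_id %in% pairs$row_id[hit]
  obs$row_id <- NULL
  obs
}

# Subjects 1 and 2 from the annotated listings above
obs12 <- data.frame(
  subjectID = rep(1:2, each = 10),
  studyday  = c(1, 26, 36, 47, 62, 68, 76, 79, 87, 92,
                1, 45, 46, 47, 53, 69, 76, 84, 89, 93)
)
starts12 <- data.frame(subjectID = c(1, 2, 2),
                       studyday  = c(22, 36, 47))
res <- flag_window(obs12, starts12)
which(res$flag)  # rows 2 3 4 12 13 14 15 16, the rows marked with * above
```

The merge produces one row per observation and start day pair within a subject, so the intermediate dataframe can be much larger than df.1 when subjects have many rows in df.2.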
Thanks for any advice,
--Matt