On 9/19/06, Eric yang <yang_eric9@yahoo.com> wrote:
Dear all,
I would like to know a fast way of reducing a data set based on certain
conditions. For example, suppose I have the following data set:
my.data <- data.frame(ID=c(rep("101-1", 10), rep("102-12", 14),
rep("103-10", 3), rep("104-2", 8)), score=round(100*runif(35)))
> my.data
ID score
1 101-1 85
2 101-1 32
3 101-1 22
4 101-1 74
5 101-1 48
6 101-1 47
7 101-1 46
8 101-1 6
9 101-1 58
10 101-1 37
11 102-12 16
12 102-12 78
13 102-12 15
14 102-12 45
15 102-12 99
16 102-12 4
17 102-12 99
18 102-12 35
19 102-12 78
20 102-12 16
21 102-12 91
22 102-12 34
23 102-12 10
24 102-12 20
25 103-10 43
26 103-10 12
27 103-10 57
28 104-2 86
29 104-2 45
30 104-2 85
31 104-2 81
32 104-2 9
33 104-2 40
34 104-2 47
35 104-2 74
I would like to reduce the data set such that I have at most the top 5
scores for each ID number. Thus, I would end up with the following data set:
1 101-1 85
4 101-1 74
5 101-1 48
6 101-1 47
9 101-1 58
12 102-12 78
15 102-12 99
17 102-12 99
19 102-12 78
21 102-12 91
25 103-10 43
26 103-10 12
27 103-10 57
28 104-2 86
30 104-2 85
31 104-2 81
34 104-2 47
35 104-2 74
Thanks for any help in advance.
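The scores below differ from those in the question because runif() draws new random numbers on each run:
my.data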
ID score
1 101-1 79
2 101-1 53
3 101-1 1
4 101-1 49
5 101-1 84
6 101-1 14
7 101-1 17
8 101-1 90
9 101-1 99
10 101-1 85
11 102-12 16
12 102-12 79
13 102-12 9
14 102-12 86
15 102-12 43
16 102-12 89
17 102-12 55
18 102-12 33
19 102-12 93
20 102-12 93
21 102-12 61
22 102-12 80
23 102-12 40
24 102-12 36
25 103-10 35
26 103-10 85
27 103-10 98
28 104-2 22
29 104-2 96
30 104-2 44
31 104-2 57
32 104-2 1
33 104-2 76
34 104-2 24
35 104-2 11
# Split by ID, sort each group by descending score, keep at most 5 rows, then recombine
do.call("rbind", lapply(split(my.data, my.data$ID), function(fr)
  fr[rev(order(fr$score)), ][1:min(5, nrow(fr)), ]))
ID score
101-1.9 101-1 99
101-1.8 101-1 90
101-1.10 101-1 85
101-1.5 101-1 84
101-1.1 101-1 79
102-12.20 102-12 93
102-12.19 102-12 93
102-12.16 102-12 89
102-12.14 102-12 86
102-12.22 102-12 80
103-10.27 103-10 98
103-10.26 103-10 85
103-10.25 103-10 35
104-2.29 104-2 96
104-2.33 104-2 76
104-2.31 104-2 57
104-2.30 104-2 44
104-2.34 104-2 24
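
The result above is sorted by score within each ID and has new row names. If the original row order and row names should be kept, as in the desired output in the question, one possible sketch (not part of the reply above) is to rank the scores within each ID with ave() and rank() and subset on the rank. The set.seed() call and the names rk and top5 are assumptions added for illustration only.

```r
# Rebuild the example data; set.seed() is an addition so that the
# random scores are reproducible from run to run
set.seed(1)
my.data <- data.frame(ID = c(rep("101-1", 10), rep("102-12", 14),
                             rep("103-10", 3), rep("104-2", 8)),
                      score = round(100 * runif(35)))

# Rank scores within each ID, highest score first (hence the minus sign).
# ties.method = "first" gives every row a distinct rank, so each ID keeps
# exactly min(5, number of rows) rows even when scores are tied.
rk <- ave(-my.data$score, my.data$ID,
          FUN = function(x) rank(x, ties.method = "first"))

# Logical subsetting keeps the original row order and row names
top5 <- my.data[rk <= 5, ]
```

Because this is a single vectorised subset rather than a split followed by rbind, it may also be somewhat faster on large data sets, though that has not been measured here.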