Many thanks to David Lorenz, who suggested do.call, and to
Bill Dunlap, who suggested table() as more general than tabulate().
David Pollard's web site also provided an example
adaptable as:
m$Y[match(m$X,names(table(m$X)[table(m$X)>2]),nomatch=0)>0]
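The same table() idea can be extended to return whole rows rather than only the Y values. The following is a minimal sketch, assuming R semantics (data.frame(), %in%); the names counts, frequent and sel are illustrative and not from the original posts, and S-PLUS may behave slightly differently.

```r
# Rebuild the example data frame from the original question.
m <- data.frame(X = c(1, 1, 4, 3, 2, 3, 1),
                Y = c("a", "b", "c", "d", "e", "f", "g"))

# Count occurrences of each distinct X value; table() accepts numeric,
# character, or factor input.
counts <- table(m$X)

# Values that occur more than twice (names of a table are character strings).
frequent <- names(counts)[counts > 2]

# Keep whole rows whose X, compared as character, is among those values.
sel <- m[as.character(m$X) %in% frequent, ]
sel
```

Because the comparison is done on character strings, the same lines should also work when X holds text; for reals, values that print identically are treated as equal.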
With thanks,
Vincent
-----Original Message-----
From: s-news-owner@lists.biostat.wustl.edu
[mailto:s-news-owner@lists.biostat.wustl.edu] On Behalf Of David L Lorenz
Sent: Friday, September 29, 2006 6:02 PM
To: vincent vinh-hung
Cc: s-news@lists.biostat.wustl.edu; s-news-owner@lists.biostat.wustl.edu
Subject: Re: [S] select rows which have at least a given number of duplicates
Vincent,
The tabulate() function will convert anything to factors and process those
data correctly. It is the subsetting [m$X] that must be integer.
You can use the by() function and do.call("rbind", ) to do what you want:
do.call("rbind", by(m, m$X, function(x) if(nrow(x) > 2) x else NULL))
Take a look at the documentation for the functions to see what they do.
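For readers who want to see the intermediate steps, the by() and do.call() line above can be unpacked as follows. This is a sketch assuming R; the names pieces and kept are illustrative only.

```r
# Example data from the original question.
m <- data.frame(X = c(1, 1, 4, 3, 2, 3, 1),
                Y = c("a", "b", "c", "d", "e", "f", "g"))

# Split m into one data frame per distinct X value; return the group
# unchanged if it has more than 2 rows, otherwise NULL.
pieces <- by(m, m$X, function(x) if (nrow(x) > 2) x else NULL)

# rbind() skips NULL arguments, so only the retained groups are stacked.
kept <- do.call("rbind", pieces)
kept
```

Since by() groups on the values of m$X whatever their type, this approach is not restricted to integer keys. The row names of the result may differ from those of m.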
Dave
"vincent vinh-hung" <conrvhgv@az.vub.ac.be>
Sent by: s-news-owner@lists.biostat.wustl.edu
09/29/2006 08:23 AM
To: <s-news@lists.biostat.wustl.edu>
cc:
Subject: [S] select rows which have at least a given number of duplicates
I would like to select rows from a table that
have at least, say, 3 duplicates.
> m <- as.data.frame(c(1,1,4,3,2,3,1))
> m$Y <- c("a","b","c","d","e","f","g")
> names(m) <- c("X","Y")
> m
  X Y
1 1 a
2 1 b
3 4 c
4 3 d
5 2 e
6 3 f
7 1 g
Command tabulate seems to do the job:
> m[tabulate(m$X)[m$X]>2,]
  X Y
1 1 a
2 1 b
7 1 g
But tabulate is limited to integers.
Are there other ways that could be applied to reals
or to text?
Thanks in advance for any suggestion.
Vincent Vinh-Hung