select rows which have at least a given number of duplicates

To: <s-news@lists.biostat.wustl.edu>
Subject: select rows which have at least a given number of duplicates
From: "vincent vinh-hung" <conrvhgv@az.vub.ac.be>
Date: Fri, 29 Sep 2006 15:23:13 +0200

I would like to select the rows of a table whose value in a
given column occurs at least, say, 3 times.

> m <- as.data.frame(c(1,1,4,3,2,3,1))
> m$Y <-c("a","b","c","d","e","f","g")
> names(m) <- c("X","Y")
> m
  X Y 
1 1 a
2 1 b
3 4 c
4 3 d
5 2 e
6 3 f
7 1 g

The tabulate command seems to do the job:
> m[tabulate(m$X)[m$X]>2,]
  X Y 
1 1 a
2 1 b
7 1 g
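
Broken into steps, the one-liner above can be read as the sketch below. The
names counts, per.row and selected are only illustrative, and the values in
the comments are those for the example data frame m shown earlier.

```r
# Rebuild the example data frame from the session above.
m <- data.frame(X = c(1, 1, 4, 3, 2, 3, 1),
                Y = c("a", "b", "c", "d", "e", "f", "g"))

# Number of occurrences of each of the values 1, 2, 3, 4 in X.
counts <- tabulate(m$X)          # 3 1 2 1

# Using X itself as an index looks up, for every row, how often
# that row's X value occurs in the whole column.
per.row <- counts[m$X]           # 3 3 1 2 1 2 3

# Keep the rows whose X value occurs more than twice (at least 3 times).
selected <- m[per.row > 2, ]
print(selected)
```

The lookup step works only because the X values are themselves valid
positions in the vector of counts.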

But tabulate is limited to integers.
Is there another approach that would also work for real-valued
or character data?

Thanks in advance for any suggestions.

Vincent Vinh-Hung


