| To: | "vincent vinh-hung" <conrvhgv@az.vub.ac.be> |
|---|---|
| Subject: | Re: select rows which have at least a given number of duplicates |
| From: | David L Lorenz <lorenz@usgs.gov> |
| Date: | Fri, 29 Sep 2006 11:02:08 -0500 |
| Cc: | s-news@lists.biostat.wustl.edu, s-news-owner@lists.biostat.wustl.edu |
| In-reply-to: | <200609291310.k8TDAmC2010603@pluto.az.vub.ac.be> |
|
Vincent,

The tabulate() function will convert anything to factors and process those data correctly. It is the subsetting [m$X] that must be integer. You can use the by() function and do.call("rbind", ) to do what you want:

    do.call("rbind", by(m, m$X, function(x) if(nrow(x) > 2) x else NULL))

Take a look at the documentation for the functions to see what they do.

Dave
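
Since the integer requirement applies to the subscript rather than to tabulate() itself, one possible alternative is to turn the key into integer codes first and then reuse the tabulate() idiom from the question. The following is only a sketch, written for R and not checked in S-PLUS; the helper name `rows_with_min_count` and its arguments are made up for illustration and do not come from any package.

```r
# Example data from the original question
m <- data.frame(X = c(1, 1, 4, 3, 2, 3, 1),
                Y = c("a", "b", "c", "d", "e", "f", "g"),
                stringsAsFactors = FALSE)

# Hypothetical helper (name and arguments are assumptions):
# keep the rows of df whose key value occurs at least min_count times.
rows_with_min_count <- function(df, key, min_count = 3) {
  code <- match(key, unique(key))  # integer code 1..k for each distinct key value
  counts <- tabulate(code)         # number of occurrences of each code
  df[counts[code] >= min_count, , drop = FALSE]
}

# Numeric key: same selection as m[tabulate(m$X)[m$X] > 2, ]
res_num <- rows_with_min_count(m, m$X)

# Text key: works the same way because match() returns integers
txt <- c("u", "v", "u", "w", "u", "v", "x")
res_txt <- rows_with_min_count(m, txt)
```

Because match() always returns integer positions, the same helper should work for real-valued or character keys. For real-valued keys, values are compared for exact equality, so numbers that differ only by rounding error are counted as distinct.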
I would like to select rows from a table that have at least, say, 3 duplicates.

    > m <- as.data.frame(c(1,1,4,3,2,3,1))
    > m$Y <-c("a","b","c","d","e","f","g")
    > names(m) <- c("X","Y")
    > m
      X Y
    1 1 a
    2 1 b
    3 4 c
    4 3 d
    5 2 e
    6 3 f
    7 1 g

Command tabulate seems to do the job:

    > m[tabulate(m$X)[m$X]>2,]
      X Y
    1 1 a
    2 1 b
    7 1 g

But tabulate is limited to integers. Are there other ways that could be applied to reals or to text?

Thanks in advance for any suggestion.

Vincent Vinh-Hung
|