On Tue, 28 Nov 2000, Anne York wrote:
> Here is one method:
> Suppose your matrix is called dum:
> 1. convert the matrix into a vector of strings:
> dum.strings <- apply(dum, 1, paste, collapse = "")
This is about what I'd recommend, but two changes can make it
more reliable and faster.
(reliability) I like to use "\001" as the separator so that distinct
rows such as 12 3 and 1 23 don't collapse to the same string. (This
assumes that "\001", control-A, rarely shows up in the data.)
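
To see the problem concretely, here is a small sketch. The matrix m is
made up for illustration; it is not from the original question.

```r
# Hypothetical two-row matrix whose rows differ but concatenate identically
m <- rbind(c(12, 3), c(1, 23))

# Empty separator: both rows become the string "123"
no.sep <- apply(m, 1, paste, collapse = "")

# Control-A separator: the column boundary is preserved in each string
with.sep <- apply(m, 1, paste, collapse = "\001")

length(unique(no.sep))    # 1: the two rows are wrongly treated as equal
length(unique(with.sep))  # 2: the rows stay distinct
```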
(speed) Paste the columns together in one call to paste instead
of calling paste on each row. E.g.,
    dum.list <- as.list(data.frame(dum))
    names(dum.list) <- NULL
    dum.strings <- do.call("paste", c(list(sep = "\001"), dum.list))
(We take off the names of the data frame so paste() doesn't
interpret them as argument names).
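
As a quick sanity check, the sketch below compares the row-wise and
column-wise approaches on a made-up matrix (m, by.row, by.col and cols
are names chosen for this example only). Both should build the same
vector of strings; the column-wise version just makes one call to
paste() instead of one per row.

```r
# Hypothetical 4 x 2 matrix with rows (1,5), (1,5), (2,6), (1,5)
m <- matrix(c(1, 1, 2, 1, 5, 5, 6, 5), ncol = 2)

# Row-wise: one call to paste() per row
by.row <- apply(m, 1, paste, collapse = "\001")

# Column-wise: a single vectorized call to paste() over the columns
cols <- as.list(data.frame(m))
names(cols) <- NULL   # drop names so they are not taken as argument names
by.col <- do.call("paste", c(list(sep = "\001"), cols))

identical(by.row, by.col)
```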
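The complete function would be:
    unique.rows1 <- function(dum) {
        # One list element per column; names removed so that paste()
        # does not treat them as argument names
        dum.list <- as.list(data.frame(dum))
        names(dum.list) <- NULL
        # One string per row, built with a single call to paste()
        dum.strings <- do.call("paste", c(list(sep = "\001"), dum.list))
        # Keep the first occurrence of each distinct row; drop = FALSE
        # keeps a one-row result from collapsing to a vector
        dum[match(unique(dum.strings), dum.strings), , drop = FALSE]
    }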
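
For readers on a reasonably recent R (an assumption; this is not part of
the approach above), unique() has a matrix method that drops duplicated
rows, which makes a handy cross-check. The sketch below applies the
paste-and-match idea inline to a made-up two-column matrix m and
compares the result with unique(m).

```r
# Hypothetical matrix with rows (1,5), (1,5), (2,6), (1,5)
m <- rbind(c(1, 5), c(1, 5), c(2, 6), c(1, 5))

# One key string per row, as in the function above (two columns here,
# so the columns are pasted directly)
keys <- paste(m[, 1], m[, 2], sep = "\001")

# First occurrence of each distinct key selects the unique rows
via.keys <- m[match(unique(keys), keys), , drop = FALSE]

# Cross-check against base R's matrix method for unique()
via.base <- unique(m)

all(via.keys == via.base)
```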
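You can add a column of counts with the following function:
    table.rows <- function(dum) {
        dum.list <- as.list(data.frame(dum))
        names(dum.list) <- NULL
        dum.strings <- do.call("paste", c(list(sep = "\001"), dum.list))
        # Fix the factor levels in order of first appearance so that the
        # counts line up with the rows selected by match() below
        tab <- table(factor(dum.strings, levels = unique(dum.strings)))
        # as.vector() strips the table attributes so that cbind() adds
        # a single plain column of counts
        cbind(dum[match(unique(dum.strings), dum.strings), , drop = FALSE],
              count = as.vector(tab))
    }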
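
The factor(..., levels = unique(...)) step matters because table() on a
plain character vector orders its counts by sorted key, not by first
appearance. A small sketch on a made-up matrix (m, keys and counts are
names chosen for this example) shows the two orders differing:

```r
# Hypothetical matrix with rows (2,6), (1,5), (2,6), (2,6)
m <- rbind(c(2, 6), c(1, 5), c(2, 6), c(2, 6))
keys <- paste(m[, 1], m[, 2], sep = "\001")

# Counts in order of first appearance: row (2,6) first, then row (1,5)
counts <- as.vector(table(factor(keys, levels = unique(keys))))

# Counts in sorted-key order, which would not match the row order here
sorted.counts <- as.vector(table(keys))

# Unique rows with their counts attached as a third column
result <- cbind(m[match(unique(keys), keys), , drop = FALSE], count = counts)
result
```

Here counts is 3 then 1, matching the rows (2,6) and (1,5) in that
order, while sorted.counts is 1 then 3.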
----------------------------------------------------------------------------
Bill Dunlap 22461 Mt Vernon-Big Lake Rd
Data Analysis Products Div. of MathSoft, Inc. Mount Vernon, WA 98274
bill@statsci.com 360-428-8146
"All statements in this message represent the opinions of the author and do
not necessarily reflect MathSoft policy or position."