Thanks to Savitri Appana, Patrick Burns, Chuck Cleland, William Stewart, and Christopher L. Green for the nice range of answers to my problem (read below).
I am constantly amazed by, and thankful for, the resource we have in the Splus community and the interaction it gives us with so many bright, approachable folks.
Several solutions were spot on, and a few took a slightly different tack, probably because I didn't ask the question quite precisely enough. But all are inventive, and I have taken them all apart to learn from them. I will just list the snippets here, in no particular order:
1)
y <- as.data.frame(cbind(sample(1:26, 100, replace=T), sample(letters, 100, replace=T)))
y[!duplicated(y),]
2)
unique(x[, c('id', 'names')])
3)
df <- data.frame(Name = rep(c('Mary','John','Tom','Anne'), each=4),Id = sample(4, 16, replace=T))
df$NameId <- interaction(df$Name, df$Id)
df[duplicated(df$NameId) == F,]
4)
temp.df <- data.frame(a=rep(c("one", "two", "three", "four"),30),b=rep(1:3,40))
temp.df[1:15,]
temp.ans <- unique(temp.df)
temp.ans
# sort to make result easier to read
temp.ans[order(temp.ans$a, temp.ans$b),]
5)
Sorting with no duplicates by 'var1' and 'var2':
# Create index that shows which rows are to be kept:
nondup.idx <- which( !duplicated(mydat[, c("var1", "var2")] ) )
mydat.nodup <- mydat[ nondup.idx, ]
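For anyone who wants to try the two main ideas side by side, here is a minimal sketch on made-up data. The data frame `x`, its columns `id`, `names`, and `value`, and the values in them are purely illustrative (the column names follow snippet 2); this is not my real data. It is written in R syntax and I would expect it to behave the same way in Splus, but treat that as an assumption.

```r
# Hypothetical data: four names, each with one fixed ID, repeated in mixed order.
x <- data.frame(
  id    = c(101, 102, 103, 104, 102, 101, 104, 103, 101, 102),
  names = c("Anne", "John", "Mary", "Tom", "John",
            "Anne", "Tom", "Mary", "Anne", "John"),
  value = 1:10
)

# Approach A (as in snippet 2): unique rows of just the two columns of interest.
pairs.a <- unique(x[, c("id", "names")])

# Approach B (as in snippets 1 and 5): a logical index marking the first
# occurrence of each id/name pair. The same index can also be used to keep
# the remaining columns of those rows.
keep <- !duplicated(x[, c("id", "names")])
pairs.b <- x[keep, c("id", "names")]
first.rows <- x[keep, ]

# Sort for readability (as in snippet 4).
pairs.sorted <- pairs.a[order(pairs.a$names, pairs.a$id), ]
print(pairs.sorted)
```

Both approaches give the same four id/name pairs here. The `duplicated()` form is the one to reach for if the other columns of the first matching row are also needed.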
*********************
Original Question:
Hello all,
I'm sure this is simple, but it has me stumped this morning...
I have a dataframe with 10K+ repeating names (character), each with a unique associated ID number. I need to simply extract a list of unique instances of these pairs. I can easily grab the unique vector of either one, but cannot think of how to get the other (conditional on uniqueness of the first).
Hope that is clear enough.
Thanks!
*********************
Michael W. Slattery
Geologist, Ohio EPA
50 West Town Street, Suite 700
Columbus OH, 43215
michael.slattery@epa.state.oh.us
614-728-1221 (Ph)
614-644-2909 (Fax)