I'll review my problem using a simple example. Consider two vectors,
x and y. The vector y contains all of x, plus additional data. One
twist is that both contain duplicate values; the other is that the
elements have been permuted, so that the elements of x are no longer
contiguous within y. I need to remove x from y in an element-wise
fashion. Here is an example:
x <- c(1, 2, 3, 1.5, 1.5, 8)
y <- c(2, 1.5, 1.5, 1.5, 1.5, 1.5, 2, 3, 9, 11)
The required result is the vector:
[1]  1.5  1.5  1.5  2.0  9.0 11.0
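As a side note on why this needs more than the usual membership test, here is a minimal sketch, assuming R's %in% operator. It is an illustration only and was not part of any of the replies; the name naive is hypothetical. Filtering by membership drops every copy of a matched value rather than one copy per element of x:

```r
x <- c(1, 2, 3, 1.5, 1.5, 8)
y <- c(2, 1.5, 1.5, 1.5, 1.5, 1.5, 2, 3, 9, 11)

# %in% ignores multiplicity: all five 1.5s, both 2s and the 3 are dropped,
# although x contains only two 1.5s and a single 2.
naive <- y[!(y %in% x)]
print(naive)
```

Only 9 and 11 survive, which is why the solutions that follow all work with per-value counts.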
My thanks to Sam Buttrey, Thomas Jagger, and Bill Dunlap for providing answers.
Thomas Jagger suggested using rle() and packaged it as a function,
complete with error/sanity checks:
rem.ainb <- function(a, b)
{
    # Run-length encode the sorted vectors: one entry per distinct value
    brl <- rle(sort(b))   # b "contains all of a"
    arl <- rle(sort(a))
    # Position of each distinct value of a among the distinct values of b
    ainb <- match(arl$values, brl$values, nomatch = NA)
    if(any(is.na(ainb)))
        stop("a has at least one value that is not in b")
    # Subtract a's counts from the matching counts in b
    brl$lengths[ainb] <- brl$lengths[ainb] - arl$lengths
    if(any(brl$lengths < 0))
        stop("There are more duplicates for some value in a than this value in b")
    return(rep(brl$values, brl$lengths))
}
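To make the run-length idea concrete, here is a hedged step-by-step sketch on the example data. Note that the example x contains 1 and 8, which do not occur in y, so rem.ainb as written would stop at its first check. The sketch below instead drops those values up front; that filtering step is an assumption added for illustration, not part of Thomas's function, and the names yr, xr, idx and res.rle are hypothetical.

```r
x <- c(1, 2, 3, 1.5, 1.5, 8)
y <- c(2, 1.5, 1.5, 1.5, 1.5, 1.5, 2, 3, 9, 11)

# Distinct values of y and how often each occurs
yr <- rle(sort(y))               # values 1.5 2 3 9 11; lengths 5 2 1 1 1

# Assumption: silently ignore values of x that never occur in y (1 and 8)
xr <- rle(sort(x[x %in% y]))     # values 1.5 2 3; lengths 2 1 1

# Subtract x's counts from the matching counts of y, then expand again
idx <- match(xr$values, yr$values)
yr$lengths[idx] <- yr$lengths[idx] - xr$lengths
res.rle <- rep(yr$values, yr$lengths)
print(res.rle)
```

The result comes back sorted, so the original ordering of y is not preserved.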
Both Sam and Bill used table() in their solutions. Sam also packaged
his solution as a function:
kim <- function(x, y)
{
    # Make table of x and y combined, to preserve levels
    tbl <- table(c(x, y))
    #
    # Zero out "tbl" so that it can be refilled with only x's data
    #
    tbl[] <- 0
    t2 <- tbl    # save an empty copy to use in a minute
    x.tbl <- table(x)
    tbl[names(x.tbl)] <- x.tbl
    # Now fill "t2" with y's entries...
    y.tbl <- table(y)
    t2[names(y.tbl)] <- y.tbl
    #...and subtract
    t3 <- t2 - tbl
    #
    # Where t3 is positive, y had excess over x; where it's negative,
    # x had excess and we don't care.
    #
    t3[t3 < 0] <- 0
    return(as.numeric(rep(names(t3), t3)))
}
Bill shortened this considerably, to a total of four lines. I've
packaged it as a function here:
x.outof.y.fun <- function(x, y)
{
    lvls <- sort(unique(y))
    tx <- table(factor(x, levels = lvls))
    ty <- table(factor(y, levels = lvls))
    return(rep(lvls, ty - tx))
}
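For reference, here is a sketch of the intermediate counts this approach produces on the example data. The pmax() guard and the as.vector() conversion are additions of mine, not part of Bill's four lines: they are assumptions meant to keep rep() from failing if x ever held more copies of a value than y does. The names n.left and res.tab are hypothetical.

```r
x <- c(1, 2, 3, 1.5, 1.5, 8)
y <- c(2, 1.5, 1.5, 1.5, 1.5, 1.5, 2, 3, 9, 11)

lvls <- sort(unique(y))                  # 1.5 2 3 9 11

# Values of x outside lvls (1 and 8) become NA and are not counted
tx <- table(factor(x, levels = lvls))    # counts 2 1 1 0 0
ty <- table(factor(y, levels = lvls))    # counts 5 2 1 1 1

# Assumed guard: clamp negative differences to zero before expanding
n.left <- pmax(as.vector(ty - tx), 0)    # 3 1 0 1 1
res.tab <- rep(lvls, n.left)
print(res.tab)
```

Because the levels are taken from y alone, values that appear only in x drop out without any special handling.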
My thanks to all of you for your kind and timely assistance!
Kim Elmore
Kim Elmore, Ph.D.
University of Oklahoma
Cooperative Institute for Mesoscale Meteorological Studies
"All of weather is divided into three parts: Yes, No, and Maybe. The
greatest of these is Maybe" The original Latin appears to be garbled.