Thank you very much to all of you who replied to my question. From
the replies I learned about the functions rle() (run length encoding)
and duplicated(). Special thanks to those who included code for the
three parts of detecting long runs: finding their lengths, finding the
values associated with them, and finding the positions at which they
start. Below are two of these examples, both using rle().
EXAMPLE #1 USING rle(): FINDS LONG SEQUENCES AND THE VALUES ASSOCIATED
WITH THEM:
# generate some data
k <- 6
test <- c(1, 7, 3, 8, 8, 8, 2, 2, 2, 1, 1, 2, 1, 4, 4, 4, rep(5, k), 9, 9,
          rep(7, k + 4), 8, 8)
# run length encoding
rle.test <- rle(test)
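# logical index of runs whose length is greater than or equal to k
# (the component is named "lengths"; "$length" only works through
# partial matching)
idx <- rle.test$lengths >= k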
# get corresponding values
rle.test$values[idx]
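
The same index also picks out the lengths of the long runs, so values and
lengths can be shown side by side. This is only a small sketch built on the
example above; the name long.runs is mine and was not part of the replies.

```r
# same data as in Example #1
k <- 6
test <- c(1, 7, 3, 8, 8, 8, 2, 2, 2, 1, 1, 2, 1, 4, 4, 4, rep(5, k), 9, 9,
          rep(7, k + 4), 8, 8)
rle.test <- rle(test)
idx <- rle.test$lengths >= k
# one row per long run: the repeated value and how many times it repeats
long.runs <- data.frame(value  = rle.test$values[idx],
                        length = rle.test$lengths[idx])
long.runs
```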
EXAMPLE #2 USING rle(): FINDS POSITIONS AT WHICH LONG SEQUENCES BEGIN
longRunsAt <- function(x, k) {
  # where are runs in x of length >= k?
  z <- rle(x)
  i <- z$lengths >= k
  # a run starts at 1 plus the total length of all runs before it
  cumsum(c(1, z$lengths[-length(z$lengths)]))[i]
}
longRunsAt(c(1, 2, 3, 3, 3, 3, 4, 4, 1, 1, 1, 1, 2, 2), k = 4)
# [1] 3 9
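
The two examples can be combined so that one call reports the start,
length, and value of each long run. What follows is a sketch under my own
assumptions rather than code from any of the replies: the function name
longRuns and the column names are made up for illustration.

```r
# hypothetical helper: start position, length, and value of every run
# in x that is at least k long
longRuns <- function(x, k) {
  z <- rle(x)
  # start of each run = 1 + total length of the runs before it
  starts <- cumsum(c(1, head(z$lengths, -1)))
  i <- z$lengths >= k
  data.frame(start = starts[i], length = z$lengths[i], value = z$values[i])
}

res <- longRuns(c(1, 2, 3, 3, 3, 3, 4, 4, 1, 1, 1, 1, 2, 2), k = 4)
res
```

Returning a data frame keeps the three pieces of information aligned row by
row, which is convenient when there are many long runs.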