Thanks to Bill Dunlap for his anwer to my question. The question
and his solutions are reproduced below. Functions f0, f1, and f2
involve a loop. f3 uses lapply instead. For my real dataset with S2000pro,
f2 is slightly faster than f3 or f1 and considerably faster
than f0.
Gabriel
------------ My question:
Assuming a list made of vectors having different lentghs
aux<-list(1:10,4:9,2:6)
I want to align them after shifting each one of them
individually by
shift<-(6,1,4)
to obtain a matrix
1 2 3 4 5 6 7 8 9 10 NA
NA NA NA NA NA 4 5 6 7 8 9
NA NA 2 3 4 5 6 NA NA NA NA
i.e. the 6th element of the first vector is aligned with the 1st element of
the 2nd vector and with the 4th element of the third vector.
My original code, included in a function by Bill Dunlap, was:
f0 <- function(aux, shift)
{
tmp <- lapply(aux, length)
shift2 <- unlist(tmp) + max(shift) - shift
for(i in 1:length(aux))
aux[[i]] <- c(rep(NA, max(shift) - shift[i]), aux[[i]], rep(
NA, max(shift2) - shift2[i]))
t(matrix(unlist(aux, use.names = F), max(shift2), length(aux)))
}
--------- Bill Dunlap'solutions:
I changed it to make the matrix first and
insert a vector into each row
f1 <- function(aux, shift)
{
tmp <- unlist(lapply(aux, length))
shift1 <- max(shift) - shift + 1
output <- array(as.double(NA), c(length(aux), max(tmp + shift1 - 1)))
for(i in 1:length(aux))
output[i, seq(shift1[i], len = tmp[i])] <- aux[[i]]
output
}
They give identical results for your small datasets and for bigger ones
made with:
shift<-sample(1:1000, size=400, replace=T)
aux<-lapply(1:400,
function(i)seq(sample(1:1000,size=1),len=sample(1:100,size=1)))
(which make a 400 by c. 1100 matrix, with the last value depending on the
random
sample you get).
In Splus6 f0(aux,shift) used 14075332 bytes of memory and 5.06 seconds and
f1(aux,shift) used 5387120 bytes (60% less) and 3.4 seconds (30% less).
In Splus4 f0 was slower (7.38 seconds) and f1 was faster (2.56 seconds)
and both used a more memory than they do in Splus6.
If you know that no length (tmp[i]) will be 0 then you can replace seq()
by : and get another speedup (down to 1.35 seconds in Splus6):
f2 <- function(aux, shift)
{
tmp <- unlist(lapply(aux, length))
shift1 <- max(shift) - shift
output <- array(as.double(NA), c(length(aux), max(tmp + shift1)))
for(i in 1:length(aux))
output[i, (1:tmp[i]) + shift1[i]] <- aux[[i]]
output
}
You can get a little more (1.19 seconds) by using matrix subscripts:
f3 <- function(aux, shift)
{
tmp <- unlist(lapply(aux, length))
shift1 <- max(shift) - shift
output <- array(as.double(NA), c(length(aux), max(tmp + shift1)))
ij <- cbind(rep(1:length(aux), tmp), unlist(lapply(1:length(aux),
function(i, shift1, tmp)
shift1[i] + 1:tmp[i], shift1, tmp)))
output[ij] <- unlist(aux)
output
}
|