s-news
[Top] [All Lists]

Summary: Alignment of columns in a matrix

To: s-news@lists.biostat.wustl.edu
Subject: Summary: Alignment of columns in a matrix
From: Gabriel Baud-Bovy <gabriel@shaker.med.umn.edu>
Date: Fri, 15 Feb 2002 14:19:52 +0100
Thanks to Bill Dunlap for his anwer to my question. The question
and his solutions are reproduced below. Functions f0, f1, and f2
involve a loop. f3 uses lapply instead. For my real dataset with S2000pro,
f2 is slightly faster than f3 or f1 and considerably faster 
than f0.

Gabriel


------------ My question: 

Assuming a list made of vectors having different lentghs

aux<-list(1:10,4:9,2:6)

I want to align them after shifting each one of them
individually by

shift<-(6,1,4)

to obtain a matrix

1  2  3  4  5  6  7  8  9 10  NA
NA NA NA NA NA 4  5  6  7  8  9 
NA NA 2  3  4  5  6 NA NA  NA NA

i.e. the 6th element of the first vector is aligned with the 1st element of 
the 2nd vector and with the 4th element of the third vector.  

My original code, included in a function by Bill Dunlap, was:

f0 <- function(aux, shift)
{
        tmp <- lapply(aux, length)
        shift2 <- unlist(tmp) + max(shift) - shift
        for(i in 1:length(aux))
                aux[[i]] <- c(rep(NA, max(shift) - shift[i]), aux[[i]], rep(
                        NA, max(shift2) - shift2[i]))
        t(matrix(unlist(aux, use.names = F), max(shift2), length(aux)))
}

---------  Bill Dunlap'solutions:

I changed it to make the matrix first and 
insert a vector into each row

f1 <- function(aux, shift)
{
        tmp <- unlist(lapply(aux, length))
        shift1 <- max(shift) - shift + 1
        output <- array(as.double(NA), c(length(aux), max(tmp + shift1 - 1)))
        for(i in 1:length(aux))
                output[i, seq(shift1[i], len = tmp[i])] <- aux[[i]]
        output
}

They give identical results for your small datasets and for bigger ones
made with:

   shift<-sample(1:1000, size=400, replace=T)
   aux<-lapply(1:400,
          function(i)seq(sample(1:1000,size=1),len=sample(1:100,size=1)))

(which make a 400 by c. 1100 matrix, with the last value depending on the
random
sample you get).

In Splus6 f0(aux,shift) used 14075332 bytes of memory and 5.06 seconds and
f1(aux,shift) used 5387120 bytes (60% less) and 3.4 seconds (30% less).
In Splus4 f0 was slower (7.38 seconds) and f1 was faster (2.56 seconds)
and both used a more memory than they do in Splus6.

If you know that no length (tmp[i]) will be 0 then you can replace seq()
by : and get another speedup (down to 1.35 seconds in Splus6):

f2 <- function(aux, shift)
{
        tmp <- unlist(lapply(aux, length))
        shift1 <- max(shift) - shift
        output <- array(as.double(NA), c(length(aux), max(tmp + shift1)))
        for(i in 1:length(aux))
                output[i, (1:tmp[i]) + shift1[i]] <- aux[[i]]
        output
}

You can get a little more (1.19 seconds) by using matrix subscripts:

f3 <- function(aux, shift)
{
        tmp <- unlist(lapply(aux, length))
        shift1 <- max(shift) - shift
        output <- array(as.double(NA), c(length(aux), max(tmp + shift1)))
        ij <- cbind(rep(1:length(aux), tmp), unlist(lapply(1:length(aux),
                function(i, shift1, tmp)
        shift1[i] + 1:tmp[i], shift1, tmp)))
        output[ij] <- unlist(aux)
        output
}




<Prev in Thread] Current Thread [Next in Thread>
  • Summary: Alignment of columns in a matrix, Gabriel Baud-Bovy <=