Special thanks to Bob Pruzek, whose success on a small Apple G4
prompted me to try this in R.
> A <- matrix(rnorm(3000000),100,30000)
> date()
[1] "Sun Jul 24 11:28:21 2005"
> S <- svd(A,nu=0)
> date()
[1] "Sun Jul 24 11:28:27 2005"
> dim(S$v)
[1] 30000 100
Works like a charm in 6 seconds.
Thanks also to Brian Ripley and Bill Dunlap for their replies.
The question now is "Shouldn't this work with Splus for Windows?"
Judging from the R code, these all use the same calls to LAPACK or
LINPACK.
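
For a wide matrix like the 100 x 30000 one above, one way to trade some time and accuracy for space is to work from the small 100 x 100 matrix A %*% t(A), so that nothing of size p x p (the array(0., c(p, p)) call in the traceback quoted below) is ever allocated. What follows is only a sketch in R, assuming tcrossprod() and eigen() are available; the reduced dimensions and the names G, E, u and v are illustrative, and I have not tried this in Splus.

```r
# Sketch only: SVD of a wide matrix A (n x p, n << p) via the n x n Gram matrix.
set.seed(1)
n <- 20; p <- 500                        # small stand-ins for 100 and 30000
A <- matrix(rnorm(n * p), n, p)

G <- tcrossprod(A)                       # A %*% t(A), only n x n
E <- eigen(G, symmetric = TRUE)
d <- sqrt(pmax(E$values, 0))             # singular values of A
u <- E$vectors                           # left singular vectors, n x n
v <- crossprod(A, u) / rep(d, each = p)  # right singular vectors, p x n

# Compare with the direct decomposition (columns may differ in sign)
S <- svd(A, nu = 0)
max(abs(d - S$d))
max(abs(abs(v) - abs(S$v)))
```

Because this squares the singular values before taking roots, it should be expected to lose accuracy for small singular values compared with a direct svd().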
Joe Verducci
______________________________________________________
> Joe,
> For the heck of it I decided to try your problem on my little Apple
> Powerbook G4 (512 Mb ram). The computation not only worked, but did
> so (correctly?) in less than 40 seconds:
>
> > A<-matrix(rnorm(3000000),30000,100)
> > dim(A)
> [1] 30000 100
> > date()
> [1] "Sat Jul 23 23:01:39 2005"
> > svd(A,nu=0)$d
> [1] 183 182 182 182 181 181 181 180 180 180 180 179 179 179
> middle snipped...
> [91] 166 166 166 166 166 165 165 165 164 164
> > date()
> [1] "Sat Jul 23 23:02:17 2005"
> ....For what it's worth...Bob Pruzek
> (SUNY Albany)
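
On the "(correctly?)" above: for an n x p matrix of independent standard normal entries, the singular values are known to concentrate roughly between sqrt(n) - sqrt(p) and sqrt(n) + sqrt(p), which for 30000 x 100 is about 163 to 183 and is consistent with the printed values. A rough sanity check in R, using a smaller matrix so it runs quickly (the sizes and seed are arbitrary choices, not from the original run):

```r
# Sketch only: compare the singular value range with sqrt(n) -/+ sqrt(p).
set.seed(3)
n <- 3000; p <- 100                      # smaller stand-ins for 30000 and 100
d <- svd(matrix(rnorm(n * p), n, p), nu = 0, nv = 0)$d
range(d)                                 # observed smallest and largest
c(sqrt(n) - sqrt(p), sqrt(n) + sqrt(p))  # approximate theoretical edges

# The same edges for the size used in the thread
c(sqrt(30000) - sqrt(100), sqrt(30000) + sqrt(100))  # roughly 163.2 and 183.2
```

This only checks plausibility of the spectrum; it says nothing about the singular vectors.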
>
> --- jsv@stat.ohio-state.edu wrote:
>
>> Dear Prof. Ripley:
>> Here is what I get
>> > A <- matrix(rnorm(3000000),100,30000)
>> > S <- svd(A,nu=0)
>> Problem in 1:ll: Cannot create - data would have length greater
>> than 536870911 (.Machine$integer.max/sizeof(integer))
>> Use traceback() to see the call stack
>> > traceback()
>> 10: eval(action, sys.parent())
>> 9: doErrorAction("Problem in 1:ll: Cannot create - data would have
>>    length greater than 536870911 (.Machine$integer.max/sizeof(integer))",
>> 8: 1:ll
>> 7: array(0., c(p, p))
>> 6: .Fortran(if(!cmplx) "dsvdcs" else "zsvdcs",
>> 5: .Fortran(if(!cmplx) "dsvdcs" else "zsvdcs",
>> 4: svd.default(A, nu = 0)
>> 3: svd(A, nu = 0)
>> 2: eval(expression(S <- svd(A, nu = 0)))
>> 1:
>> Message: Problem in 1:ll: Cannot create - data would have length
>> greater than 536870911 (.Machine$integer.max/sizeof(integer))
>> I interpreted line 7 from the traceback as svd's attempt to
>> allocate a 30K x 30K intermediate result matrix. This is corroborated
>> by the fact that the same message appears in the allocation:
>> > A <- matrix(0,30000,30000)
>> Problem in 1:ll: Cannot create - data would have length greater
>> than 536870911 (.Machine$integer.max/sizeof(integer))
>> Use traceback() to see the call stack
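
The arithmetic behind that reading of the error can be sketched as follows. This is R code rather than S-PLUS, the variable names are illustrative, and the 8 bytes per element assumes double-precision storage.

```r
# Sketch only: size of a 30000 x 30000 matrix against the quoted limit.
p <- 30000
n_elem <- p * p                # number of elements, 9e8 (held as a double)
limit <- 536870911             # length limit quoted in the error message

n_elem > limit                 # TRUE: the matrix exceeds the limit
n_elem * 8 / 2^30              # about 6.7 GiB as 8-byte doubles
.Machine$integer.max %/% 4     # 536870911, i.e. integer.max / sizeof(integer)
```

So a p x p work array is out of reach here regardless of how much physical memory the machine has.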
>> I think that construction of a full symmetric matrix may be part of a
>> computationally fast algorithm, but I would be willing to trade some
>> extra time to do the computation with less space.
>> I am using "svd" that is included in the initial loading of Splus.
>> Thanks for any further pointers you might provide.
>> Joe Verducci
>> ______________________________
>> > What do you want from the svd? In an M x N system the left singular
>> > vectors form an M x M system, taking up much more space than the 2Gb
>> > Windows will allow S-PLUS. By default svd() avoids this by giving you
>> > only N (for M > N) components of the singular vectors.
>> >
>> > You will probably do better to do things in two steps, for example
>> >
>> > A <- matrix(rnorm(30), 10, 3)
>> > S <- svd(A, nu=0)
>> > S$u <- A %*% S$v / rep(S$d, each=nrow(A))
>> >
>> > Now, if finding S is too slow (and it should not be with LAPACK-based
>> > code, so you might want to try the Matrix library) you can get the
>> > components (less accurately) from eigen(crossprod(A)),
>> >
>> > E <- eigen(crossprod(A))
>> > S <- list(d=sqrt(E$values), v=E$vectors)
>> > S$u <- A %*% S$v / rep(S$d, each=nrow(A))
>> >
>> > My 1GB laptop swapped rather a lot, but was able to do that in less
>> > than a minute CPU.
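
The two routes suggested above can be compared on the small 10 x 3 example. This is a sketch in R only; the seed, the symmetric = TRUE argument and the name S2 are additions for illustration, and the reconstruction error is just one possible check.

```r
# Sketch only: check both routes on the small example from the message.
set.seed(2)
A <- matrix(rnorm(30), 10, 3)

# Route 1: singular values and right vectors only, then recover u
S <- svd(A, nu = 0)
S$u <- A %*% S$v / rep(S$d, each = nrow(A))

# Route 2: eigen-decomposition of the small 3 x 3 cross-product
E <- eigen(crossprod(A), symmetric = TRUE)
S2 <- list(d = sqrt(E$values), v = E$vectors)
S2$u <- A %*% S2$v / rep(S2$d, each = nrow(A))

# Both should reproduce A; columns may differ in sign between routes
max(abs(A - S$u %*% diag(S$d) %*% t(S$v)))
max(abs(A - S2$u %*% diag(S2$d) %*% t(S2$v)))
max(abs(S$d - S2$d))
```

For a wide matrix (more columns than rows) crossprod(A) is the large product, so the roles of rows and columns would need to be swapped.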
>> >
>> > On Fri, 22 Jul 2005 jsv@stat.ohio-state.edu wrote:
>> >
>> >> I'm using Splus 7.0 on Windows XP with 4Gb memory. I would like to
>> >> perform an svd on a rectangular array where one dimension is very large
>> >> (around 30K) and the other is relatively small (around 100).
>> >> Unfortunately, svd is not one of the functions listed to take advantage
>> >> of the out-of-memory mode in version 7.0. Right now my feasible limits
>> >> in-memory are (8K x 100). It should be possible to program an SVD
>> >> for rectangular arrays using sequential optimization, but before going
>> >> down that route, I am hoping someone here knows of an available code.
>> >
>> > --
>> > Brian D. Ripley,                  ripley@stats.ox.ac.uk
>> > Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
>> > University of Oxford,             Tel:  +44 1865 272861 (self)
>> > 1 South Parks Road,                     +44 1865 272866 (PA)
>> > Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>> >
>> > --------------------------------------------------------------------
>> > This message was distributed by s-news@lists.biostat.wustl.edu. To
>> > unsubscribe send e-mail to s-news-request@lists.biostat.wustl.edu with
>> > the BODY of the message: unsubscribe s-news