David Smith wrote, helpfully as ever:
> The performance of the S computation engine -- i.e. the time to run
> expressions using the command line version -- should be about the
> same on an Intel machine whether it's running Windows or Linux. If
> you're observing otherwise, it may be because your .Data directory
> is on a remote-mounted disk (remote from the server machine where
> S-PLUS is running, that is). There can be a lot of random access to
> files in that folder which can slow down S-PLUS on some networks.
> I'd suggest running S-PLUS with a locally-mounted .Data folder and
> see if that makes a difference.
>
Aha! I've been puzzling intermittently for months over why I was the
only person who sometimes (and sometimes not) had unreasonably slow
data access:
# make a moderate sized data frame with a single column
bigdf <- data.frame(runif(30000));
# extract the first column from it fairly quickly
time.start <- proc.time();
avec <- bigdf[,1];
cat(proc.time()-time.start,"\n");
# on the other hand, it takes ages to make a copy of the whole
# data frame (i.e. the identical data)
time.start <- proc.time();
adf <- bigdf
cat(proc.time()-time.start,"\n");
On my competent modern Pentium under Windows 2000, connecting to a .Data
on a different machine over a local 100 Mb/s Ethernet, the column
extraction takes 0.7 s but the whole-frame copy takes 20 s; the slowdown
is more severe for larger datasets. With a local .Data there is
practically no difference between the two.
It would be a help to me if David could expand on his knowledge that
remote .Data files "can slow down S-PLUS on some networks" here or on
the S-Plus website, which seems not to mention this.
Jonathan
-------------------------------------
Dr Jonathan Swinton
Proteom Ltd
Babraham Hall
Babraham
Cambridge CB2 4AT
tel: 01223 496180
fax: 01223 496181
jswinton@proteom.com
www.proteom.com
-----------------------------------