Hello there,
I routinely work with huge data sets in S+ (over 2 GB) on a Unix machine
with 8 GB of RAM. The best way I have found to import them without using
too many resources is to pre-allocate the output data.frame and then
import the data in blocks with the function "readNextDataRows".
I have read data sets of over 4 million rows and 100 variables this way.
I have adapted a function supplied by Insightful's technical support to
automate the whole process. If anyone is interested, I can email them the
function.
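
For anyone who just wants the general idea, here is a rough sketch of the
pre-allocate-then-fill-by-blocks pattern. It is written in Python rather
than S+, and it is not the Insightful function mentioned above. The name
"read_in_blocks", its parameters, and the assumptions that the file is a
header-less, all-numeric CSV whose row count is known in advance are all
hypothetical, chosen only for illustration.

```python
import csv
import io
from itertools import islice


def read_in_blocks(handle, n_rows, n_cols, block_size=1000):
    """Fill pre-allocated columns from a header-less numeric CSV, block by block.

    Hypothetical helper: assumes n_rows and n_cols are known before reading.
    """
    # Pre-allocate one fixed-length list per column so nothing grows during import.
    columns = [[0.0] * n_rows for _ in range(n_cols)]
    reader = csv.reader(handle)
    filled = 0
    while filled < n_rows:
        # Pull at most block_size rows; only this block is held as raw text.
        block = list(islice(reader, min(block_size, n_rows - filled)))
        if not block:
            break  # the file had fewer rows than n_rows
        for offset, row in enumerate(block):
            for j in range(n_cols):
                columns[j][filled + offset] = float(row[j])
        filled += len(block)
    return columns, filled


if __name__ == "__main__":
    # Small in-memory stand-in for a large file on disk.
    sample = io.StringIO("1,10\n2,20\n3,30\n4,40\n5,50\n")
    cols, n_read = read_in_blocks(sample, n_rows=5, n_cols=2, block_size=2)
    print(n_read, cols)
```

The point of the design is that the output is sized once, up front, and
each block is copied into its slot and then discarded, so peak memory is
roughly the final object plus one block rather than several copies of the
growing data set.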
Have fun,
Gérald Jean
Consulting Analyst (Statistics), Actuarial Department
telephone: (418) 835-4900 ext. 7639
fax: (418) 835-6657
email: gerald.jean@dgag.ca
"In God we trust; all others must bring data." W. Edwards Deming