s-news

Re: really large files

To: goldwater@schoolph.umass.edu
Subject: Re: really large files
From: gerald.jean@dgag.ca
Date: Thu, 16 Dec 2004 10:04:49 -0500
Cc: Chushu Gu <chushugu@hotmail.com>, s-news@lists.biostat.wustl.edu, s-news-owner@lists.biostat.wustl.edu
Hello there,

I routinely work with huge data sets in S+, over 2GB, on a Unix machine
with 8GB of RAM.  The best way I have found to import such data sets
without using too many resources is to pre-allocate the output data.frame
and then import the data in blocks using the function "readNextDataRows".
I have read data sets of over 4M rows and 100 variables using this method.
I have adapted a function supplied by Insightful's technical support to
automate the whole process.  If anyone is interested, I could email them
the function.
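
To make the idea concrete for readers without S+ at hand, here is a minimal
sketch in Python (not S+, and not the Insightful-derived function mentioned
above).  It assumes an all-numeric CSV file with a header line and a row
count known in advance; the name "read_in_blocks" and its parameters are
hypothetical, chosen only for illustration.  The columns are allocated once
at full size and then filled block by block, so only one block of raw rows
is held in memory at a time.

```python
import csv
import io
from array import array
from itertools import islice


def read_in_blocks(lines, n_rows, n_cols, block_size=1000):
    """Fill pre-allocated numeric columns from CSV input, one block at a time.

    Hypothetical helper: assumes every field parses as a float and that
    n_rows is known before reading starts.
    """
    # Pre-allocate one full-length column of doubles per variable.
    columns = [array("d", [0.0]) * n_rows for _ in range(n_cols)]

    reader = csv.reader(lines)
    header = next(reader)  # first line holds the variable names

    filled = 0
    while filled < n_rows:
        # Pull at most block_size rows; never read past the allocated size.
        block = list(islice(reader, min(block_size, n_rows - filled)))
        if not block:
            break  # input ended before n_rows rows were read
        for j in range(n_cols):
            # Same-length slice assignment overwrites in place, no resizing.
            columns[j][filled:filled + len(block)] = array(
                "d", (float(row[j]) for row in block)
            )
        filled += len(block)

    return header, columns, filled


# Tiny in-memory stand-in for a large file on disk.
sample = io.StringIO("x,y\n1,2\n3,4\n5,6\n")
header, columns, filled = read_in_blocks(sample, n_rows=3, n_cols=2, block_size=2)
```

With a real file, an open file handle would be passed in place of the
StringIO object, and the block size would be tuned to the available memory.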

Have fun,

Gérald Jean
Consulting analyst (statistics), Actuarial
telephone            : (418) 835-4900 ext. 7639
fax                  : (418) 835-6657
email                : gerald.jean@dgag.ca

"In God we trust all others must bring data"  W. Edwards Deming


