s-news

scanning large files

To: <s-news@lists.biostat.wustl.edu>
Subject: scanning large files
From: "Stephen Tallon" <s.tallon@irl.cri.nz>
Date: Wed, 19 Dec 2001 11:09:47 +1300
Importance: Normal
Hi,
 
S+6 rel 2, win2k, 256MB
 
I have large text files of about 2GB (and in principle of indefinite size), each consisting of sequential blocks of data of around 6000 points. My system won't load a file this size, and no system could load an arbitrarily large file. However, the calculations I wish to do relate only to one block at a time and never span blocks, so in principle they could be done in sequence. I was hopeful when I read that S+6 could map a file instead of copying the data into memory, but there still appears to be an initial surge in memory use while the map is set up, so it does not help me. The scan function appears to have improved under S+6, and its skip argument will now skip data without consuming memory. But scanning through from the beginning of the file to reach later parts takes an increasingly long time, and in the long run this is not a feasible solution for me either.
 
If there is a simple way to load these blocks of data, please let me know. If there isn't, it would be nice if a future version of S+ provided a file position pointer that persists across multiple calls to scan, so that each call can jump straight to where the previous one finished.
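 
The file-position idea can be illustrated outside S+ with a short sketch. It is written in Python, not S+, and it does not describe any existing S+ feature. The function names read_block and block_means are hypothetical. It assumes one numeric value per line and a fixed number of lines per block, which is a simplification of the "around 6000 points" described above. Each call seeks directly to a saved byte offset and returns the offset at which it stopped, so no call rescans earlier parts of the file and only one block is held in memory at a time.

```python
import os
import tempfile


def read_block(path, offset, n_lines):
    """Read up to n_lines lines of numbers starting at byte position offset.

    Returns (values, new_offset). Passing new_offset to the next call resumes
    exactly where this call stopped. Assumes an ASCII text file.
    """
    values = []
    # Binary mode, so that offsets are plain byte positions.
    with open(path, "rb") as f:
        f.seek(offset)  # jump straight to the saved position
        for _ in range(n_lines):
            line = f.readline()
            if not line:  # end of file
                break
            values.extend(float(tok) for tok in line.decode("ascii").split())
        new_offset = f.tell()
    return values, new_offset


def block_means(path, n_lines):
    """Example per-block calculation: the mean of each block, in file order."""
    means = []
    offset = 0
    while True:
        block, new_offset = read_block(path, offset, n_lines)
        if new_offset == offset:  # nothing was read: end of file
            break
        offset = new_offset
        if block:  # skip blocks that held only blank lines
            means.append(sum(block) / len(block))
    return means


if __name__ == "__main__":
    # Small stand-in for the large data file: the values 1..10, one per line.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write("\n".join(str(v) for v in range(1, 11)) + "\n")
    try:
        print(block_means(tmp.name, 4))  # blocks of 4, 4 and 2 values
    finally:
        os.remove(tmp.name)
```

If the blocks vary in length, the same pattern would need a way to recognise block boundaries (for example a separator line) instead of a fixed line count. That detail is not given above, so it is left out of the sketch.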
 
My alternative is simply to break the file up using other software, but it shouldn't have to be that way.
 
Thanks.
 
Steve
 