Hi Win,
I’m writing in part to say “I feel your pain”. I am working on a large-scale
simulator: 10^6 to 10^8 simulated patient records, used to test some algorithms for analyzing
observational studies. “Memory leaks”, as these issues are
known in the trade, have been a big problem, and I think they’ve been
worse in the later versions. As far as I can tell from the documentation,
there is no way of purposefully invoking garbage collection from a script in
order to clean this up, though when the script ends the memory is clearly
freed. Maybe someone from TIBCO has a hint on this. I have resorted
to putting checkpoints in the script and simply restarting it from the last
checkpoint when it runs out of memory, which happens about once a day in my case.
After experimenting with this issue, there is one useful piece of information
I can offer: the rbind() and cbind() functions seem to be particular culprits.
To stick two or more data frames together, the S-PLUS documentation advises
you, for performance reasons, to build one large data frame up front and then
use assignments to put in the various pieces, rather than use rbind() or
cbind(). But rbind() is just so darn convenient that I succumbed to the
temptation to use it once or twice. In looking into my memory leak issue, I
found that rbind-ing two big data frames together was a good way to lose a
chunk of memory. Assigning row ranges within a big frame made the problem
better (but didn’t make it go away entirely).
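A minimal sketch of the "build the big frame first, then assign row ranges" approach described above. The sizes, the column names (id, value) and the runif() stand-in for real data are hypothetical, chosen only for illustration; this is not code from my simulator.

```r
# Hypothetical sizes, for illustration only.
n.chunks   <- 4
chunk.rows <- 1000
total.rows <- n.chunks * chunk.rows

# Allocate the full-size data frame once, up front.
big <- data.frame(id = numeric(total.rows), value = numeric(total.rows))

for (i in 1:n.chunks) {
    # Stand-in for whatever produces one piece of the result.
    piece <- data.frame(id = rep(i, chunk.rows), value = runif(chunk.rows))

    # Fill a row range in place instead of calling rbind(big, piece).
    rows <- ((i - 1) * chunk.rows + 1):(i * chunk.rows)
    big[rows, ] <- piece
}
```

The only design point is that big is created once at its final size, so each pass overwrites existing rows rather than building a new, larger frame.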
It’s easy enough to experiment with various functions to see which ones are
hurting you the most. Write a loop that does a piece of your calculations,
prints out memory.size(), and then assigns NULL to all the variables. You can
then see how the amount of memory leakage differs as you make various changes
to the code. As in the case of rbind()/cbind(), there may be a way to rewrite
the code to improve things.
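The experiment loop might look roughly like the sketch below. The body of the loop is a hypothetical stand-in (two small data frames joined with rbind()); substitute a piece of your own calculation. memory.size() is the S-PLUS function mentioned in this thread, and I am assuming it reports the bytes currently in use.

```r
for (i in 1:5) {
    # Hypothetical stand-in for one piece of the real calculation.
    a  <- data.frame(x = runif(10000))
    b  <- data.frame(x = runif(10000))
    ab <- rbind(a, b)

    # Report memory in use after this pass (S-PLUS memory.size()).
    print(memory.size())

    # Assign NULL to every variable before the next pass.
    a  <- NULL
    b  <- NULL
    ab <- NULL
}
```

If the printed values keep climbing from pass to pass even though everything is set to NULL, the code inside the loop is a likely source of the accumulation. Swapping in a different implementation and comparing the growth per pass shows which version is cheaper.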
Hope this helps,
Alan
Alan Hochberg
VP, Research
ProSanos Corporation
225 Market St. Ste. 502,
Harrisburg, PA 17101
Tel 717-635-2124 * Fax 717-635-2575
From: s-news-owner@lists.biostat.wustl.edu [mailto:s-news-owner@lists.biostat.wustl.edu] On Behalf Of Crawford.Winnie
Sent: Friday, February 20, 2009 5:00 PM
To: S-PLUS Newsgroup
Subject: [S] memory accumulation
All,
S-Plus 8 on Windows XP
I am running into a memory problem with a script. After running it 3 or 4 times,
I get an error that there is not enough dynamic memory. I checked the help and
found the memory.size() function, so I ran it at the beginning of the S-Plus
session and after every script run. Upon opening S-Plus, I get a value of
7541984. After each run, I got the following:
8936128
9173952
9569904
9867680
The last value was after the script crashed. When I tried to open the Language
Reference, S-Plus crashed. I opened it again and looked up options().
Under memory, it says:
memory
the maximum size (in bytes) for all
in-memory data. If this limit is exceeded, the session is terminated to avoid
runaway computations that may slow down or crash the computing system. You may
want to check for memory growth by calling the function memory.size; if things
get out of hand, quit and re-invoke S-PLUS.
The last sentence surprised me. I thought there would be a function to clean up the
memory. It seems like quitting and re-invoking S-Plus is a little harsh. Has
anyone figured out a way to scrub the memory accumulation? Thanks for any help
you can provide.
Win
*****************************************************************
Winifred C. Crawford
Staff Scientist/Senior Meteorologist
ENSCO, Inc.
Aerospace Sciences and Engineering Division
1980 N. Atlantic Ave., Suite 830
Cocoa Beach, FL 32931
VOICE: 321.853.8130 FAX: 321.853.8415
EMAIL: crawford.winnie@ensco.com
AMU Quarterly Reports are available online:
http://science.ksc.nasa.gov/amu
*****************************************************************