s-news

Re: running out of dynamic memory when using predict.bdGlm

To: "'s-news@lists.biostat.wustl.edu'" <'s-news@lists.biostat.wustl.edu'>
Subject: Re: running out of dynamic memory when using predict.bdGlm
From: Marc Pelath <Marc.Pelath@qrm.com>
Date: Mon, 3 Dec 2007 10:14:02 -0600
Accept-language: en-US
In-reply-to: <E3B3028217BCB5418E73AAB66D74DF270212565F86@Exchange2007.qrm.com>
References: <E3B3028217BCB5418E73AAB66D74DF270212565F86@Exchange2007.qrm.com>
Thread-index: Acg1xDLhQ9iIFBhQQuS7Oj3dw8wDKAAAqUtQ
Thread-topic: running out of dynamic memory when using predict.bdGlm
FYI, this was resolved by closing and reopening SPLUS, apparently to free up 
some memory used in the original estimation.  Thanks to Wayne Thogmartin at the 
USGS.

From: s-news-owner@lists.biostat.wustl.edu 
[mailto:s-news-owner@lists.biostat.wustl.edu] On Behalf Of Marc Pelath
Sent: Monday, December 03, 2007 9:50 AM
To: 's-news@lists.biostat.wustl.edu'
Subject: [S] running out of dynamic memory when using predict.bdGlm

Hi everybody,

I'm estimating a binomial GLM on a large dataset (about 2.5M records, 100 
variables).  The model uses about 20 of those variables, many of which are 
categorical, so it has (at the moment) just over 100 parameters.  
SPLUS seems to estimate the model just fine, although of course it takes a 
while, and it produces the (sensible-looking) in-sample fits without 
complaints.  However, when I try to generate out-of-sample predictions using 
predict (where newdata = OutData has about 500k records), I get the dreaded 
"unable to obtain requested dynamic memory" error.  Traceback follows:

---
15: eval(action, sys.parent())
14: doErrorAction("Problem in bd.internal.exec.node(engine.class = 
\"com.insightful.miner.BDLManager$BDLSplusScri..: 
BDLManager$BDLSplusScriptEngineNode (0): Problem in 
model.matrix.default(args$terms.object, IM$in1, args$contrasts.arg, 
args$xlevels): Unable to obtain requested dynamic memory",
13: stop(ret$error)
12: bd.internal.exec.node(engine.class = 
"com.insightful.miner.BDLManager$BDLSplusScriptEngineNode", node.props = 
node.props, inputs = in.bdFrame.lst, num.outputs =

11: list(
10: NULL
9: bd.block.apply(data, FUN = bd.internal.model.matrix.script, test = F, 
one.block = F, sample = F)
8: bd.internal.model.matrix(terms(pform), mf, contrasts = object$contrasts, 
xlevels = object$xlevels)
7: predict.bdGlm(sub.glm, OutData, type = "response")
6: predict(sub.glm, OutData, type = "response")
5: eval(i, local)
4: source(auto.print = auto.print, exprs = substitute(exprs.literal))
3: script.run(exprs.literal = {
2: eval(expression(script.run(exprs.literal = {
1:
Message: Problem in bd.internal.exec.node(engine.class = 
"com.insightful.miner.BDLManager$BDLSplusScri..: 
BDLManager$BDLSplusScriptEngineNode (0): Problem in 
model.matrix.default(args$terms.object, IM$in1, args$contrasts.arg, args$xlevels): 
Unable to obtain requested dynamic memory
---

I'm at a loss to explain this, since it is using predict.bdGlm, and my 
understanding is that this is exactly the limitation that the bigdata library 
is supposed to address.  Clearly it's able to produce such results on a larger 
data set (namely, the sample used to estimate the model), so why would it choke 
on a smaller data set?

I'm running SPLUS 8.0 under Windows XP.  I have 2 GB of RAM and a page file of 
about 3 GB, although since it's supposed to be using bigdata routines, I'm not 
sure how this matters.  I also have about 100 GB of free disk space.
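
For a rough sense of scale, here is a back-of-envelope sketch (in Python, purely as a calculator) of how much memory a fully expanded model matrix would need if it were held in memory all at once.  It assumes dense 8-byte doubles and uses 100 columns as a stand-in for "just over 100 parameters"; the 64 MiB block budget is a made-up figure for illustration.  It says nothing about how the bigdata library actually allocates memory.

```python
# Back-of-envelope size of a dense model matrix held entirely in memory.
# Assumptions (not taken from the traceback): 8-byte doubles, 100 columns as
# a stand-in for "just over 100 parameters", and a hypothetical 64 MiB budget.

BYTES_PER_DOUBLE = 8
MIB = 2 ** 20


def dense_matrix_bytes(n_rows, n_cols):
    """Bytes needed for an n_rows x n_cols matrix of doubles."""
    return n_rows * n_cols * BYTES_PER_DOUBLE


def rows_per_block(budget_bytes, n_cols):
    """Largest number of rows whose expanded block fits in the budget."""
    return budget_bytes // (n_cols * BYTES_PER_DOUBLE)


out_sample_bytes = dense_matrix_bytes(500_000, 100)    # newdata, ~500k rows
in_sample_bytes = dense_matrix_bytes(2_500_000, 100)   # estimation sample
block_rows = rows_per_block(64 * MIB, 100)

print(f"out-of-sample model matrix: {out_sample_bytes / MIB:.0f} MiB")
print(f"in-sample model matrix:     {in_sample_bytes / MIB:.0f} MiB")
print(f"rows per 64 MiB block:      {block_rows}")
```

Under these assumptions the out-of-sample matrix is a few hundred MiB and the in-sample one is close to 2 GB.  So an allocation failure on the smaller data set more likely reflects how much memory was still free in the session at the time than the size of OutData.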

I'm going to try chopping down the number of variables in the dataset to see if 
that helps, but I feel like I shouldn't have to.  Any ideas?  I'm hoping 
somebody has run into this problem before - it doesn't seem like an unusual 
situation.  I've searched the archives but couldn't find any guidance.

Thanks in advance, and hope I can return the favor someday,
Marc Pelath

