| To: | s-news@lists.biostat.wustl.edu |
|---|---|
| Subject: | Re: |
| From: | "Donald Catanzaro, PhD" <dgcatanzaro@gmail.com> |
| Date: | Thu, 25 Sep 2008 15:30:15 -0500 |
| Reply-to: | dgcatanzaro@gmail.com |
| User-agent: | Thunderbird 2.0.0.16 (Windows/20080708) |
Hi All,

My apologies to the list as I lurch forward in my humble quest to cross-validate my dataset. As folks have seen, it is going rather slower than I had hoped, which is due mainly to my own shortcomings rather than anything else.

I've been working on splitting my dataset 80/20, fitting a model to the 80% and then using the remaining 20% for model validation. As performance measures for the 80% model, I'd like to use the AIC and BIC computed from the 20% validation dataset.

It is rather nice that glm includes a subset option, so I can fit my model to 80% of the data when supplied with the correct vector. Is there a similar option that lets me run the 20% data through the 80%-derived GLM and thus pull out the deviance and log-likelihood without additional calculations?

If not, then if I understand correctly, my other option would be to:

A) predict the 80% data points from the 80% model

B) find mu and size of the 80% predicted data points using fitdistr

C) calculate the log-likelihood of the 20% validation dataset using mu and size from the 80% predicted data points

D) calculate AIC and BIC from that log-likelihood

If I can't run the 20% data through my 80% model, would A-D get me where I'd like to be?

---Don

Don Catanzaro, PhD
Landscape Ecologist
dgcatanzaro@gmail.com
16144 Sigmond Lane
479-751-3616
Lowell, AR 72745
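For readers following the thread, the arithmetic behind steps C and D can be sketched roughly as below. This is only an illustration under stated assumptions, not the poster's code and not an S-PLUS routine. It assumes that "mu" and "size" refer to the mean and dispersion of a negative binomial distribution. The function names `nb_logpmf` and `holdout_scores`, the parameter count `n_params`, and all the numbers are hypothetical. The sketch takes one mean per validation point; to use a single mu as in step B, repeat the same value for every point.

```python
import math


def nb_logpmf(y, mu, size):
    """Log-probability of count y under a negative binomial
    parameterised by mean `mu` (> 0) and dispersion `size` (> 0)."""
    return (
        math.lgamma(y + size) - math.lgamma(size) - math.lgamma(y + 1)
        + size * math.log(size / (size + mu))
        + y * math.log(mu / (size + mu))
    )


def holdout_scores(y, mu, size, n_params):
    """Step C: sum the log-likelihood over the validation counts.
    Step D: turn that log-likelihood into AIC and BIC.
    `n_params` (number of estimated parameters) is an assumed input."""
    loglik = sum(nb_logpmf(yi, mi, size) for yi, mi in zip(y, mu))
    n = len(y)
    aic = -2.0 * loglik + 2.0 * n_params
    bic = -2.0 * loglik + n_params * math.log(n)
    return loglik, aic, bic


# Made-up validation counts and means, purely for illustration.
y_valid = [0, 3, 1, 7, 2]
mu_valid = [0.8, 2.5, 1.2, 5.0, 2.1]

loglik, aic, bic = holdout_scores(y_valid, mu_valid, size=1.5, n_params=3)
print("log-likelihood:", loglik)
print("AIC:", aic)
print("BIC:", bic)
```

The sample size in the BIC penalty here is the number of validation observations, which is one possible choice rather than a settled convention for held-out data.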