- 1. Re: bootstrapped data-splitting into "test" and "training" sets (score: 1)
- Author: Frank E Harrell Jr <feh3k@spamcop.net>
- Date: Sun, 2 May 2004 19:51:18 -0400
- The misclassification rate is an improper scoring rule, i.e., it is optimized by a bogus model. It can even increase when an important variable is added to the model. I suggest you use a proper scori
- /archives/html/s-news/2004-05/msg00005.html (10,008 bytes)
- 2. Re: code for classification success (score: 1)
- Author: Frank E Harrell Jr <feh3k@spamcop.net>
- Date: Wed, 7 Apr 2004 18:30:12 -0500
- The percent correct is an "improper scoring rule", i.e., it may be optimized by a bogus model. There are much better measures. Many of them are implemented in the val.prob function in the Hmisc libra
- /archives/html/s-news/2004-04/msg00029.html (9,021 bytes)
- 3. Re: code for classification success (score: 1)
- Author: Frank E Harrell Jr <feh3k@spamcop.net>
- Date: Wed, 7 Apr 2004 19:11:00 -0500
- My mistake - it's in the Design library. -FH -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University
- /archives/html/s-news/2004-04/msg00030.html (7,920 bytes)
- 4. Re: goodness of fit (score: 1)
- Author: Frank E Harrell Jr <feh3k@spamcop.net>
- Date: Thu, 8 Apr 2004 10:48:25 -0500
- Do library(Hmisc,T); library(Design,T); ?residuals.lrm (look at type='gof'). A better approach is to do directed tests of linearity and additivity (no interaction). -FH -- Frank E Harrell Jr Professor a
- /archives/html/s-news/2004-04/msg00040.html (9,156 bytes)
- 5. Re: goodness of fit (score: 1)
- Author: Frank E Harrell Jr <feh3k@spamcop.net>
- Date: Thu, 8 Apr 2004 11:21:28 -0500
- A directed goodness of fit test is just a likelihood ratio or Wald chi-square test for the added variables, similar to the way you are proceeding. FH -- Frank E Harrell Jr Professor and Chair School
- /archives/html/s-news/2004-04/msg00041.html (11,041 bytes)
- 6. Re: S-PLUS Vs some other softwares (score: 1)
- Author: Frank E Harrell Jr <feh3k@spamcop.net>
- Date: Wed, 3 Mar 2004 08:49:53 -0500
- A slightly faster approach may be to use the following function which is part of the Hmisc library. The S-Plus version is shown below. x is a matrix, y a vector. lm.fit.qr.bare <- function(x, y, tole
- /archives/html/s-news/2004-03/msg00034.html (11,049 bytes)
- 7. Re: question (score: 1)
- Author: Frank E Harrell Jr <feh3k@spamcop.net>
- Date: Wed, 10 Mar 2004 13:09:51 -0600
- In the Design library look at help files for validate.lrm and residuals.lrm. The latter uses approximate leave-out-one estimates for examining influence. The former will do bootstrap and x-fold cross
- /archives/html/s-news/2004-03/msg00106.html (7,378 bytes)
- 8. Re: logistic regression (score: 1)
- Author: Frank E Harrell Jr <feh3k@spamcop.net>
- Date: Sat, 13 Mar 2004 13:46:40 -0500
- This is not really an S-Plus question but is much more a question about statistical methodology. It would be beneficial to study one or more of the several books that cover logistic regression, then
- /archives/html/s-news/2004-03/msg00126.html (8,878 bytes)
- 9. Re: Function( ) function with sascode( ) (score: 1)
- Author: Frank E Harrell Jr <feh3k@spamcop.net>
- Date: Tue, 23 Mar 2004 16:15:03 -0600
- You did not provide the version of S-Plus, the version of Design, or the operating system, which makes answering your question harder. And if you can data.dump( ) the turesp0304 object I could try to
- /archives/html/s-news/2004-03/msg00175.html (9,465 bytes)
- 10. Re: plsmo (score: 1)
- Author: Frank E Harrell Jr <feh3k@spamcop.net>
- Date: Fri, 26 Mar 2004 15:41:26 -0500
- Ignore that warning. If you can send a minimal example that fails I will fix the problem. It's best to simulate data and send the code that did that. Second best is to subset your data as small as you
- /archives/html/s-news/2004-03/msg00215.html (8,214 bytes)
- 11. Re: plsmo (score: 1)
- Author: Frank E Harrell Jr <feh3k@spamcop.net>
- Date: Fri, 26 Mar 2004 16:52:54 -0500
- Right, I was assuming this was S-Plus because of the posting to s-news. save(...., compress=T) is a great way to share data in R. I wish S-Plus had it. -Frank -- Frank E Harrell Jr Professor and Chai
- /archives/html/s-news/2004-03/msg00219.html (11,599 bytes)
- 12. Re: ancova model checking (score: 1)
- Author: Frank E Harrell Jr <feh3k@spamcop.net>
- Date: Mon, 23 Feb 2004 08:23:06 -0500
- There are faster systematic ways to go about this, but any of these procedures is likely to destroy the meaning of P-values and confidence limits, and to result in biased regression models with overly opt
- /archives/html/s-news/2004-02/msg00184.html (8,680 bytes)
- 13. 4 (score: 1)
- Author: Frank E Harrell Jr <feh3k@spamcop.net>
- Date: Mon, 26 Jan 2004 09:46:53 -0500
- Thanks very much for the fix Willi. It will be in the next release of Frank -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University
- /archives/html/s-news/2004-01/msg00144.html (7,681 bytes)
- 14. g (score: 1)
- Author: Frank E Harrell Jr <feh3k@spamcop.net>
- Date: Thu, 29 Jan 2004 14:49:19 -0600
- reasonable not very reasonable. Better to do data reduction (ignoring Y) and fit a full model on the reduced set. Good idea to do a double bootstrap for this type of strategy Why? This has severe pro
- /archives/html/s-news/2004-01/msg00179.html (9,641 bytes)
- 15. Re: plot of survival probability vs. covariate (score: 1)
- Author: Frank E Harrell Jr <feh3k@spamcop.net>
- Date: Thu, 11 Dec 2003 20:35:37 -0500
- This will get you started. Look into the documentation for confidence limits. library(Design) # if S-Plus do library(Hmisc,T);library(Design,T) dd <- datadist(yourdataframe) options(datadist='dd') f
- /archives/html/s-news/2003-12/msg00070.html (9,573 bytes)
- 16. Re: splitting data (score: 1)
- Author: Frank E Harrell Jr <feh3k@spamcop.net>
- Date: Wed, 17 Dec 2003 18:51:02 -0500
- Also look at the cut2 function in the Hmisc library. -FH -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University
- /archives/html/s-news/2003-12/msg00099.html (10,806 bytes)
- 17. Re: cv.tree (score: 1)
- Author: Frank E Harrell Jr <feh3k@spamcop.net>
- Date: Tue, 4 Nov 2003 12:02:32 -0500
- This appears to be working correctly. Unless you have huge sample sizes trees can be quite unstable. If you were to simulate datasets like yours multiple times, each time regrowing a tree, you would
- /archives/html/s-news/2003-11/msg00016.html (9,148 bytes)
- 18. Re: test for interaction (score: 1)
- Author: Frank E Harrell Jr <feh3k@spamcop.net>
- Date: Fri, 14 Nov 2003 17:27:21 -0500
- /archives/html/s-news/2003-11/msg00102.html (8,841 bytes)
- 19. Re: test for interaction (score: 1)
- Author: Frank E Harrell Jr <feh3k@spamcop.net>
- Date: Fri, 14 Nov 2003 19:44:09 -0500
- /archives/html/s-news/2003-11/msg00104.html (12,120 bytes)
- 20. Re: creating decile ranking (score: 1)
- Author: Frank E Harrell Jr <feh3k@spamcop.net>
- Date: Tue, 25 Nov 2003 08:08:49 -0500
- /archives/html/s-news/2003-11/msg00170.html (8,221 bytes)