Re: ROC curves in S?

To: <nkoper@yahoo.com>
Subject: Re: ROC curves in S?
From: "Eric C. Lanes" <Lanesec@michigan.gov>
Date: Fri, 09 Dec 2005 11:13:07 -0500
Cc: <s-news@wubios.wustl.edu>
Nicola,

I'll try to be as helpful as I can here and not confuse what is a relatively 
simple request on your part:

I am not an S-Plus user, but I am aware of its many similarities to R. I do 
not know at this point whether there is a true S-Plus equivalent of the ROCR 
package for R.

You may draw the ROC curve if you like, just to eyeball it. Its relative 
"roughness" or "smoothness" may at least give you a visual sense of how varied, 
exemplary or non-exemplary your cases are probability-wise when, for example, 
a multivariate model (y~x1+x2....xK) has been used. That is, how "consistently" 
cases "behave" in terms of the spread of predicted probabilities and their 
"distance" from .50 when applying a particular model or when comparing models. 
The AUC ("C" in lrm) is merely a summary measure of this "behavior". 
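
To make that last point concrete, here is a minimal base-R sketch of the AUC as a rank statistic: the proportion of (event, non-event) pairs in which the event case received the higher predicted probability, with ties counted as half. The function name auc_rank and the toy vectors are my own inventions for illustration and do not come from any package.

```r
# auc_rank is a hypothetical helper, not part of ROCR, Hmisc or Design.
# p: predicted probabilities; y: observed outcome coded 0/1.
auc_rank <- function(p, y) {
  n1 <- sum(y == 1)                 # number of events
  n0 <- sum(y == 0)                 # number of non-events
  r  <- rank(p)                     # midranks, so ties count as half
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Toy data: 3 of the 4 (event, non-event) pairs are ordered correctly
p <- c(0.10, 0.40, 0.35, 0.80)
y <- c(0, 0, 1, 1)
auc_rank(p, y)                      # 0.75
```

A value of .50 means the model orders such pairs no better than chance, and 1 means every event case is ranked above every non-event case.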

Harrell has rightly noted in response to your query the problems associated 
with the arbitrary SELECTION of cutpoints on predicted probabilities, etc.

In R, the ROCR package generates ROC/AUC, false-positive, true-positive, error 
in prediction and a good number of other measures associated with the 
classification task (see documentation .pdf  on CRAN for the list).

The use of lrm for logistic models is fine (also if you have an ordinal 
polytomous y). However, there are other procedures which I find more flexible 
attribute-wise for a binary outcome y (depending upon what you'd like to 
do): logistf for one, glm is another.

For example:

library(ROCR)
library(logistf)
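### You may use y~x1+x2+...-1 to get rid of the intercept if so desired. ###
### In logistf, firth=TRUE applies the penalty and firth=FALSE turns it ###
### off; pl controls the profile-likelihood confidence intervals. ###
fit=logistf(y~x1+x2...., firth=TRUE)
exp(fit$coefficients) ### Odds ratio associated with each x ###
fitpred=prediction(fit$predict,fit$y)
fitperf=performance(fitpred, "auc")
fitperf@y.values[[1]] ### The AUC itself ###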
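
Since the original question was about glm, here is a rough base-R sketch of the same idea that needs neither ROCR nor logistf: fit the model, sweep a cutoff over the predicted probabilities to get the ROC points, and take the area by the trapezoid rule. The simulated data and the names x1, x2, cuts, tpr, fpr and auc are made up for illustration. I have not run this in S-Plus 6.2, so treat it as an R sketch only.

```r
# Simulated data purely for illustration; substitute your own y, x1, x2.
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- rbinom(n, 1, plogis(-0.5 + 1.2 * x1 - 0.8 * x2))

fit <- glm(y ~ x1 + x2, family = binomial)
p   <- fitted(fit)                  # predicted probabilities

# ROC points: sweep the cutoff over every distinct predicted probability,
# starting from (0, 0), i.e. nobody classified as an event
cuts <- sort(unique(p), decreasing = TRUE)
tpr  <- c(0, sapply(cuts, function(k) mean(p[y == 1] >= k)))
fpr  <- c(0, sapply(cuts, function(k) mean(p[y == 0] >= k)))

# Area under the curve by the trapezoid rule
auc <- sum(diff(fpr) * (head(tpr, -1) + tail(tpr, -1)) / 2)
auc

# The curve itself, with the chance diagonal for reference
plot(fpr, tpr, type = "l",
     xlab = "False positive rate", ylab = "True positive rate")
abline(0, 1, lty = 2)
```

The cutoff sweep is used here only to trace the curve and compute its area, not to pick an operating point.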

Hope you find this helpful,

Eric Lanes, Ph.D.
Michigan Department of Corrections





>>> Frank E Harrell Jr <f.harrell@vanderbilt.edu> 12/8/05 4:00:38 PM >>>
Nicola Koper wrote:
> Is there a way to generate ROC curves in Splus for
> logistic regression calculated using GLM? I'm using
> version 6.2.
> 
> Thanks,
> Nicky

Don't draw the curve, which can lead people to select arbitrary 
cutpoints on the predicted probabilities and to use an improper loss 
(utility) function.  But the area under the ROC curve is a good 
discrimination measure.  You get it (C index) by

library(Hmisc,T)
library(Design,T)
f <- lrm(y ~ x1 + x2 + ....)
f   # prints C and other rank measures of predictive discrimination

Frank Harrell


> 
> ************************************************************
> 
> Dr. Nicola Koper
> Assistant Professor
> Natural Resources Institute
> University of Manitoba
> 
> --------------------------------------------------------------------
> This message was distributed by s-news@lists.biostat.wustl.edu.  To
> unsubscribe send e-mail to s-news-request@lists.biostat.wustl.edu with
> the BODY of the message:  unsubscribe s-news
> 


-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University

