
Re: How to create a "summary" data frame

To: "'Hunsicker, Lawrence'" <lawrence-hunsicker@uiowa.edu>, <s-news@lists.biostat.wustl.edu>
Subject: Re: How to create a "summary" data frame
From: "Alan Hochberg" <alan.hochberg@prosanos.com>
Date: Mon, 22 Dec 2008 17:29:34 -0500
In-reply-to: <2B80F69A8A189D48B0E668B0BBC6BA4201E010FF@HC-MAIL13.healthcare.uiowa.edu>
References: <2B80F69A8A189D48B0E668B0BBC6BA4201E010FF@HC-MAIL13.healthcare.uiowa.edu>
Thread-index: AclkgXtcoC+OxQCaRiC9gxzViQ/c+wAAt2dw

How about this?

 

> theData <- data.frame(Center=paste("CTR",as.character(round(runif(1000,min=0,max=99),0)),sep=""), testDone=rbinom(1000,1,0.3))

> by(theData,theData$Center,function(x) {mean(x$testDone)})

theData$Center:CTR0

[1] 0.5714286

---------------------------------------------------------------------------------------------------------

theData$Center:CTR1

[1] 0.1

---------------------------------------------------------------------------------------------------------

theData$Center:CTR10

[1] 0.1111111

---------------------------------------------------------------------------------------------------------

theData$Center:CTR11

[1] 0.2

---------------------------------------------------------------------------------------------------------

theData$Center:CTR12

[1] 0.3076923

---------------------------------------------------------------------------------------------------------

theData$Center:CTR13

[1] 0.4

---------------------------------------------------------------------------------------------------------

theData$Center:CTR14

[1] 0.25

. . .

 

Note that Center must be a factor in order for this to work (according to the documentation for “by”).
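
If the goal is the two-column data frame asked for in the original question, aggregate() may be a more direct route than by(), since its result is already a data frame that keeps the center IDs. What follows is a minimal sketch in R syntax; S-PLUS also provides aggregate() and merge(), but whether the arguments behave identically there is an assumption, not something verified here. The names centerSummary, fracTested and merged are invented for illustration, and sample() stands in for the round(runif()) construction only to keep the example short.

```r
# Simulated data in the same shape as the example above.
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
theData <- data.frame(
  Center   = paste("CTR", sample(0:99, 1000, replace = TRUE), sep = ""),
  testDone = rbinom(1000, 1, 0.3)
)

# One row per center: first column is the center ID,
# second column is the mean of the 0/1 flag (the fraction tested).
centerSummary <- aggregate(theData$testDone,
                           by = list(Center = theData$Center),
                           FUN = mean)
names(centerSummary)[2] <- "fracTested"  # hypothetical column name

# Attach the per-center fraction to every patient row by center ID.
merged <- merge(theData, centerSummary, by = "Center")
```

After the merge, every row of merged carries the fraction for its own center, which is the extra column the question describes.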

 

Hope this helps and Happy Holidays,

 

Alan

 

Alan Hochberg

VP, Research

ProSanos Corporation

225 Market St. Ste. 502,

Harrisburg, PA 17101

Tel 717-635-2124 * Fax 717-635-2575

 

 

 


From: s-news-owner@lists.biostat.wustl.edu [mailto:s-news-owner@lists.biostat.wustl.edu] On Behalf Of Hunsicker, Lawrence
Sent: Monday, December 22, 2008 5:06 PM
To: s-news@lists.biostat.wustl.edu
Subject: [S] How to create a "summary" data frame

 

Hi, folks, and Happy Holidays to all:

I have a data frame with about 11,000 patients from about 600 different centers.  Roughly half of these patients have had a certain test done, and the other half have not had the test.  But the fraction with the test varies from center to center.  I’d like to add a column to the data frame indicating the fraction of patients at each center that had the test done.  I tried doing this using the GUI to calculate the average of 0 (no) and 1 (yes) values, doing the average by center, and saving the result.  I get a list with a single value (the average) for each center, but the center IDs are not included, so that I can’t do a merge on center ID.  How can I create a data frame with two columns, the first column being the center number, and the second being the fraction of patients with the test done?

Larry Hunsicker
