s-news

Re: proportion of states life expectancy > 70

To: "Douglas Bates" <bates@stat.wisc.edu>
Subject: Re: proportion of states life expectancy > 70
From: Tim Hesterberg <timh@insightful.com>
Date: Thu, 06 Dec 2007 11:40:56 -0800
Cc: "bhavin toprani" <b_toprani@hotmail.com>, "s-news@lists.biostat.wustl.edu" <s-news@lists.biostat.wustl.edu>
In-reply-to: <40e66e0b0712061052o4e33c1a0x42848de7f53854fd@mail.gmail.com> (bates@stat.wisc.edu)
References: <BAY114-W117569695E180FAC4F7DB98D770@phx.gbl> <40e66e0b0712061052o4e33c1a0x42848de7f53854fd@mail.gmail.com>
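A minor improvement on Bates' answer is to use mean() instead of sum(),
since the mean of a logical vector is the proportion of TRUE values:
  mean(mydf$Life.Exp > 70)
or, if there may be missing values:
  mean(mydf$Life.Exp > 70, na.rm = TRUE)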
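
As a minimal sketch of how the two approaches compare, the following R code rebuilds a small data frame from the six rows quoted below and computes the proportion both ways. The objects over70, prop_sum, prop_mean, mydf_na and prop_na are names made up for this illustration. Note that nrow() is applied to the data frame itself here, because in R nrow() of a plain vector returns NULL.

```r
# Hypothetical data frame built from the six example rows in the quoted question
mydf <- data.frame(
  Life.Exp = c(69.05, 69.31, 70.55, 70.66, 71.71, 72.06),
  row.names = c("Alabama", "Alaska", "Arizona",
                "Arkansas", "California", "Colorado")
)

# Logical vector; TRUE counts as 1 and FALSE as 0 in arithmetic
over70 <- mydf$Life.Exp > 70

# Count of TRUE values divided by the number of rows
prop_sum <- sum(over70) / nrow(mydf)

# Same proportion in a single call
prop_mean <- mean(over70)

# Introduce one missing value to show the effect of na.rm = TRUE,
# which drops the NA from both the numerator and the denominator
mydf_na <- mydf
mydf_na$Life.Exp[1] <- NA
prop_na <- mean(mydf_na$Life.Exp > 70, na.rm = TRUE)

print(c(prop_sum = prop_sum, prop_mean = prop_mean, prop_na = prop_na))
```

With these six rows, four states exceed 70, so both prop_sum and prop_mean come out to 4/6. After Alabama is set to NA, the proportion is taken over the five remaining states.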

>On Nov 27, 2007 10:38 PM, bhavin toprani <b_toprani@hotmail.com> wrote:
>
>> Dear all,
>
>>  Could you please suggest an easy command (e.g., if, for, etc.) to find the
>> proportion of states having life expectancy greater than 70?
>
>>  Thanks for your help in advance.
>
>>  Example of data frame:
>>              Life.Exp
>>  Alabama        69.05
>>  Alaska         69.31
>>  Arizona        70.55
>>  Arkansas       70.66
>>  California     71.71
>>  Colorado       72.06
>
>I haven't seen a response to this.  The general way to approach this
>is to sum the logical indicators.  When the TRUE/FALSE values are used
>in an arithmetic expression they are converted to 1/0 values.
>Suppose that your data frame is called mydf.  Then the expression
>
>sum(mydf$Life.Exp > 70)/nrow(mydf)
>
>should do it.  If you want to be careful about the possibility of
>missing data you should use
>
>sum(mydf$Life.Exp > 70, na.rm = TRUE)/sum(!is.na(mydf$Life.Exp))
>
>instead.
