
Re: bootstrapping question

To: "Walter R. Paczkowski" <dataanalytics@earthlink.net>
Subject: Re: bootstrapping question
From: Tim Hesterberg <timh@insightful.com>
Date: 27 Oct 2004 08:17:21 -0700
Cc: S-news <s-news@lists.biostat.wustl.edu>
In-reply-to: <417FBF0B.6000304@earthlink.net> (dataanalytics@earthlink.net)
References: <417FBF0B.6000304@earthlink.net>
Instead of
        mean(apply(data.test, 2, mean))
use
        mean(colMeans(data.test))


Your resampling doesn't seem to take into account the two-sample nature
of the problem (2002 vs. 2003).  For confidence intervals, you can do a
two-sample bootstrap using
        "bootstrap2".
For a hypothesis test, you should do a permutation test instead; see
        "permutationTest2".
Both are part of the S+Resample library; see below my signature.
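
To make the two-sample idea concrete, here is a minimal hand-rolled sketch
in Python (standard library only).  It is not the S+Resample implementation
and does not use its interface.  The function names (score, bootstrap_diff,
permutation_pvalue) and the simulated data are hypothetical, and it assumes
the 2002 and 2003 respondents are independent samples.  Respondents (rows)
are resampled separately within each year for the bootstrap; for the
permutation test, the year labels are shuffled across the pooled respondents.

```python
import random
import statistics

def score(rows):
    """Mean over messages of the proportion of respondents who agreed."""
    n_messages = len(rows[0])
    col_means = [sum(r[j] for r in rows) / len(rows) for j in range(n_messages)]
    return sum(col_means) / n_messages

def bootstrap_diff(rows_a, rows_b, n_boot=1000, seed=0):
    """Two-sample bootstrap of score(b) - score(a): SE and rough 95% percentile interval."""
    rng = random.Random(seed)
    reps = []
    for _ in range(n_boot):
        # Resample respondents with replacement, independently within each year.
        res_a = rng.choices(rows_a, k=len(rows_a))
        res_b = rng.choices(rows_b, k=len(rows_b))
        reps.append(score(res_b) - score(res_a))
    reps.sort()
    se = statistics.stdev(reps)
    lo = reps[int(0.025 * n_boot)]
    hi = reps[int(0.975 * n_boot) - 1]
    return se, (lo, hi)

def permutation_pvalue(rows_a, rows_b, n_perm=999, seed=0):
    """Two-sided permutation test of no difference in score between the years."""
    rng = random.Random(seed)
    observed = score(rows_b) - score(rows_a)
    pooled = list(rows_a) + list(rows_b)
    n_a = len(rows_a)
    extreme = 0
    for _ in range(n_perm):
        # Shuffling the pooled rows reassigns respondents to years at random.
        rng.shuffle(pooled)
        diff = score(pooled[n_a:]) - score(pooled[:n_a])
        if abs(diff) >= abs(observed) - 1e-12:  # small tolerance for float ties
            extreme += 1
    # Count the observed statistic itself so the p-value is never zero.
    return observed, (extreme + 1) / (n_perm + 1)

# Simulated 0/1 ratings, made up for illustration:
# 20 respondents x 3 messages in each year, for a single product.
sim = random.Random(1)
data_2002 = [[int(sim.random() < 0.5) for _ in range(3)] for _ in range(20)]
data_2003 = [[int(sim.random() < 0.6) for _ in range(3)] for _ in range(20)]

se, ci = bootstrap_diff(data_2002, data_2003)
obs, p = permutation_pvalue(data_2002, data_2003)
print("difference:", obs, "SE:", se, "95% interval:", ci, "p-value:", p)
```

With the real data, the same steps would be repeated for each of the 30
products.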

>Hi,
>
>I have a question on how to set up a bootstrapping problem in S-Plus (v. 
>6.2.1 on Windows).  The basic problem involves survey data.   Each 
>survey respondent was asked to rate 20 messages about a product on, 
>let's say, a 0/1 scale where 1 is agree.  There are 30 products.  My 
>client found the proportion of respondents who agreed (a "1") with the 
>message for a product and then averaged over the 20 messages to create a 
>single score for all the messages for that product.  He repeated this 
>for all 30 products.  He did this in 2002 and 2003.  He asked me to 
>calculate the standard errors and do a test to see if the score in 2003 
>differs from the score in 2002 for each product.  The answers will 
>determine compensation for management.
>
>I want to do a bootstrap to calculate the standard errors and the mean 
>scores and then use these to test the difference.  I set up a simple 
>bootstrap in S using
>
>        bootstrap(data.test, mean(apply(data.test, 2, mean)))
>
>where data.test is a data frame of three randomly generated messages 
>coded as  0/1 numbers (the columns) with 20 "respondents".  This 
>represents just a single product.  The mean was used since the mean of 
>the 0/1 data is just the sample proportion.  The actual data, when I get 
>it, will have 20 messages and several thousand respondents for each of 
>the 30 products.
>
>This call to bootstrap gave me what I think I need, but since I've used 
>this function so infrequently and am not that familiar with bootstrap, I 
>need clarification on what it's doing.  I believe it's resampling 
>data.test, then passing the resampled data frame to the function 
>mean(apply...)).  Is this correct?  Based on the description above of 
>what the client did, is this function giving me what I want?  Can I do a 
>simple hypothesis test using the results?  Can I assume normality for 
>that test?  Is there anything else I need to consider?
>
>Thanks for any help or advice,
>
>Walt Paczkowski

Tim Hesterberg

========================================================
| Tim Hesterberg       Research Scientist              |
| timh@insightful.com  Insightful Corp.                |
| (206)283-8691        1700 Westlake Ave. N, Suite 500 |
| (206)802-2500 (fax)  Seattle, WA 98109-3044, U.S.A.  |
|                      www.insightful.com/Hesterberg   |
========================================================
Download the S+Resample library from www.insightful.com/downloads/libraries

