s-news

Tricky resampling problem

To: <s-news@lists.biostat.wustl.edu>
Subject: Tricky resampling problem
From: "Jan Ivanouw" <ivanouw@post8.tele.dk>
Date: Thu, 18 Aug 2005 13:56:23 +0200
In a psychological experiment each subject is observed a certain number of 
times (O). The number of observations is determined by a process influenced by 
the subject. At each observation, four categorical variables (a, b, c and d) 
are independently recorded as present or absent (so, for instance, both a and 
c may be present in the same observation). For each subject, the number of 
occurrences (Na, Nb, Nc and Nd) and the relative frequency (Ra, Rb, Rc and Rd) 
of each categorical variable are calculated across that subject's 
observations. My sample has 150 subjects, each with between 14 and 50 
observations. For the whole sample I calculate the mean, SD, median, skewness 
and kurtosis of each of the variables O, Na, Nb, Nc, Nd, Ra, Rb, Rc and Rd 
across the subjects.
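
To make the quantities above concrete, here is a minimal sketch in Python 
rather than S-PLUS. The function names and the data layout (one 0/1 tuple 
(a, b, c, d) per observation) are only assumptions for illustration, and the 
skewness and kurtosis use the simple moment definitions (kurtosis as excess 
over 3), which may differ from what S-PLUS reports.

```python
import statistics


def subject_summary(observations):
    """Summarise one subject.

    observations: list of (a, b, c, d) tuples of 0/1 flags, one per
    observation (hypothetical layout, not taken from the real data set).
    """
    n_obs = len(observations)
    summary = {"O": n_obs}
    for idx, name in enumerate("abcd"):
        count = sum(obs[idx] for obs in observations)
        summary["N" + name] = count          # number of occurrences
        summary["R" + name] = count / n_obs  # relative frequency
    return summary


def describe(values):
    """Mean, SD, median, skewness and excess kurtosis across subjects."""
    n = len(values)
    mean = statistics.fmean(values)
    # Central moments with divisor n (assumption: moment-based definitions).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "sd": statistics.stdev(values),  # sample SD, divisor n - 1
        "median": statistics.median(values),
        "skewness": m3 / m2 ** 1.5 if m2 > 0 else float("nan"),
        "kurtosis": m4 / m2 ** 2 - 3.0 if m2 > 0 else float("nan"),
    }
```

Applying subject_summary to every subject and then describe to each column 
(O, Na, ..., Rd) would give the table of statistics described above.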

Now, I want to compare these statistics with the results from other similar 
studies, but my distribution of observations, O, is different because I have 
more subjects with a low and with a high number of observations, and I suspect 
that a slightly different psychological process is involved for persons with 
very low and very high O. I therefore want to recalculate the statistics on a 
randomly selected subset of my sample whose O-distribution is more like that 
of the other studies. For this purpose I randomly draw, for instance, 20% of 
the subjects with O=14, 25% of those with O=15, 30% of those with O=17, 18 or 
19, etc., and calculate the statistics on this reduced sample.
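
A single draw of this kind could be sketched as follows (again in Python, 
only to illustrate the idea). It assumes each subject is a dict with an "O" 
entry, that a fixed, rounded number of subjects is drawn without replacement 
within each value of O, and that subjects whose O has no listed fraction are 
all kept; the name thin_by_o is made up.

```python
import random
from collections import defaultdict


def thin_by_o(subjects, fractions, rng):
    """Draw a subset of subjects with a given selection fraction per O.

    subjects:  list of dicts, each with at least an "O" key (assumed layout)
    fractions: dict mapping an O value to the fraction to keep, e.g. {14: 0.2}
    rng:       a random.Random instance, so draws can be reproduced
    """
    # Group the subjects by their number of observations.
    strata = defaultdict(list)
    for subj in subjects:
        strata[subj["O"]].append(subj)

    kept = []
    for o_value in sorted(strata):
        group = strata[o_value]
        # Assumption: O values without a listed fraction are kept in full.
        frac = fractions.get(o_value, 1.0)
        # Python's round() sends exact halves to the nearest even integer.
        n_keep = round(frac * len(group))
        # Sampling without replacement within the stratum.
        kept.extend(rng.sample(group, n_keep))
    return kept
```

For example, thin_by_o(subjects, {14: 0.20, 15: 0.25, 17: 0.30}, 
random.Random(1)) would return one reduced sample.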

I need an S-PLUS function which, given as arguments the data set, the 
selection frequencies for each O (20%, 25%, 30%, etc. in my example) and the 
number of resamples, draws that many different resamples, calculates the 
statistics for the variables from each of these resamples, and returns for 
each variable and statistic the mean and confidence limits from the resample 
distribution.
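
In outline, the function I have in mind would behave like the Python sketch 
below. This is not an S-PLUS solution, only a statement of the intended 
logic: the names are invented, the subjects are assumed to be dicts of 
per-subject values, the draws are without replacement within each O, and the 
limits are simple percentiles of the resample distribution.

```python
import random
import statistics
from collections import defaultdict


def draw_subset(subjects, fractions, rng):
    """One stratified draw: keep a fraction of the subjects for each O."""
    strata = defaultdict(list)
    for subj in subjects:
        strata[subj["O"]].append(subj)
    kept = []
    for o_value in sorted(strata):
        group = strata[o_value]
        # Assumption: O values with no listed fraction are kept in full.
        n_keep = round(fractions.get(o_value, 1.0) * len(group))
        kept.extend(rng.sample(group, n_keep))
    return kept


def percentile(sorted_values, p):
    """Percentile by linear interpolation; p is between 0 and 1."""
    pos = p * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


def resample_summary(subjects, fractions, n_resamples, variables,
                     stat_funcs, level=0.95, seed=None):
    """Mean and percentile limits of each statistic over repeated draws.

    variables:  names of the per-subject values, e.g. ["O", "Na", "Ra"]
    stat_funcs: dict of statistic name -> function of a list of values
    """
    rng = random.Random(seed)
    draws = defaultdict(list)
    for _ in range(n_resamples):
        subset = draw_subset(subjects, fractions, rng)
        for var in variables:
            values = [subj[var] for subj in subset]
            for stat_name, func in stat_funcs.items():
                draws[(var, stat_name)].append(func(values))

    alpha = (1.0 - level) / 2.0
    result = {}
    for key, vals in draws.items():
        vals.sort()
        result[key] = {
            "mean": statistics.fmean(vals),
            "lower": percentile(vals, alpha),
            "upper": percentile(vals, 1.0 - alpha),
        }
    return result
```

A call might look like resample_summary(subjects, {14: 0.20, 15: 0.25}, 1000, 
["O", "Na", "Ra"], {"mean": statistics.fmean, "median": statistics.median}). 
Note that the limits obtained this way describe only the variation caused by 
the random selection of subjects, which may or may not be the appropriate 
kind of confidence limit here.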

I have tried to figure out how to do this using the S+ SAMPLE and/or 
BOOTSTRAP functions. I tried the Davison/Hinkley BOOT, the standard S+ 
version 6.1 BOOTSTRAP and the revised S+Resample library BOOTSTRAP, but have 
not succeeded in solving the problem, and I would be very glad of any 
suggestions.

Thanks,

Jan Ivanouw PhD
Dep of Health Psychology
University of Copenhagen
Denmark
ivanouw@post8.tele.dk 


