s-news

Re: truncate dataset

To: <theforester@comcast.net>, <s-news@lists.biostat.wustl.edu>
Subject: Re: truncate dataset
From: <Rich@Mango-Solutions.com>
Date: Wed, 21 Feb 2007 22:43:15 -0000

Hi Keith

How about something like this:

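> # I assume my dataset is called "df" in this example
> sRow <- seq(nrow(df))
> # Keep the rows whose position lies strictly between the 20% and 80%
> # points of the row index (df must already be sorted for this to
> # trim the extremes)
> df[sRow > quantile(sRow, .2) & sRow < quantile(sRow, .8), ]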
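
The snippet above trims by row position over the whole dataset. For the per-category case in the question, a rough sketch might split the data by category and trim each piece on its own age quantiles. The column names "category", "age" and "growth", the made-up data, and the helper name trimMiddle are all assumptions for illustration, not taken from the original post:

```r
# Hypothetical example data: the column names "category", "age" and
# "growth" are assumptions, not taken from the original post.
set.seed(1)
df <- data.frame(
  category = rep(c("A", "B"), times = c(50, 30)),
  age      = c(runif(50, 10, 60), runif(30, 20, 90)),
  growth   = rnorm(80)
)

# Keep the rows whose age lies between the 20th and 80th percentile
# of age within the data frame passed in (one category at a time).
trimMiddle <- function(d, lower = 0.2, upper = 0.8) {
  q <- quantile(d$age, c(lower, upper))
  d[d$age >= q[1] & d$age <= q[2], ]
}

# Split by category, trim each subset separately, then recombine.
middle <- do.call(rbind, lapply(split(df, df$category), trimMiddle))
table(middle$category)
```

Because the quantiles are computed inside each subset, the cut-off ages for A and B can differ, which seems to be what the question is after. Whether the boundary rows should be kept (>= and <=) or dropped (> and <) is a judgment call.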

Hope this helps,

Rich.


-----Original Message-----
From: theforester@comcast.net [mailto:theforester@comcast.net]
Sent: 21 February 2007 16:58
To: s-news@lists.biostat.wustl.edu
Subject: [S] truncate dataset

All,

I am trying to eliminate the effects of age upon my analysis of growth vs category.  I have two categories and, for each, I would like to truncate the top 20% and the bottom 20% of the data subset.

Two categories: A & B

Dependent var: growth

Independent var: measures of mixture

The number of elements in each category is not the same. Also, the distribution of category A is not necessarily the same as the distribution of category B; e.g., the bottom 20% of A would not necessarily be the same age as the bottom 20% of B.

Does anyone have suggestions about the syntax for extracting or subsetting the dataset to only look at the middle 60% of each category?

I couldn't find anything in the archives, but perhaps my search keywords were not correct.

Thank you in advance.

Keith
