s-news

Re: A (probably) trivial question about first occurrence in unique()

To: s-news@lists.biostat.wustl.edu
Subject: Re: A (probably) trivial question about first occurrence in unique()
From: Sarah Henderson <sarah.henderson@ubc.ca>
Date: Wed, 30 May 2007 14:54:41 -0700
In-reply-to: <07E228A5BE53C24CAD490193A7381BBB9E35F1@LP-EXCHVS07.CO.IHC.COM>
References: <7.0.1.0.2.20070530141341.0254bc50@ubc.ca> <07E228A5BE53C24CAD490193A7381BBB9E35F1@LP-EXCHVS07.CO.IHC.COM>

Thanks to Greg, Richard, Christopher and Alejandro for their helpful and varied responses. Only Christopher's did not get posted to the list, so here it is:

> Here is an (untested!) idea: let's say for concreteness the data are
> already in a data frame with columns STREETS and MUNIC (sounds like that's
> your setup). I would convert STREETS to a factor variable (if it is not
> already a factor), then sort the data correctly (by STREETS then MUNIC)
> then use tapply to select the first occurrence of MUNIC. Rough code:

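> # Sort by street, then by municipality within each street
> road.data <- road.data[order(road.data$STREETS, road.data$MUNIC), ]
> # For each street, keep the first municipality in the sorted order
> tapply(road.data$MUNIC, road.data$STREETS, function(x) x[1])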

> Of course sorting MUNIC in S-Plus probably doesn't respect the North-South
> ordering, so that would have to be done differently (add a variable that
> codes the ordering and sort by that, for instance). It sounds like you
> already have the data sorted how you want, so perhaps you just need the
> tapply statement.
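
As a rough illustration of the "add a variable that codes the ordering" idea in the quoted message, here is a minimal sketch. It assumes R-style syntax (S-PLUS may differ in small details), and the street names, municipality names, the ns.rank lookup and the NS column are all invented for the example; none of them come from the original data.

```r
# Hypothetical example data; the street and municipality names are invented.
road.data <- data.frame(
  STREETS = c("Main St", "Main St", "Oak Ave", "Oak Ave", "Oak Ave", "Pine Rd"),
  MUNIC   = c("Surrey", "Burnaby", "Delta", "Burnaby", "Surrey", "Delta"),
  stringsAsFactors = FALSE
)

# Assumed north-to-south ranking of the municipalities (1 = northernmost).
ns.rank <- c(Burnaby = 1, Surrey = 2, Delta = 3)

# Add a helper column (hypothetical name NS) that codes the ordering.
road.data$NS <- unname(ns.rank[road.data$MUNIC])

# Sort by street, then by north-south rank within each street.
road.data <- road.data[order(road.data$STREETS, road.data$NS), ]

# Take the first (northernmost) municipality for each street.
first.munic <- tapply(road.data$MUNIC, road.data$STREETS, function(x) x[1])
```

MUNIC is kept as character here so that tapply returns municipality names rather than factor codes; if MUNIC is a factor, wrapping the result in as.character() inside the function is one way around that.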

Greg's duplicated() suggestion was the easiest to implement for my quick-and-dirty investigation.
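
Greg's message is not reproduced here, so the following is only a guess at what a duplicated()-based approach might look like, not his actual code. It assumes R-style syntax, invented example data, and rows that are already in the desired order.

```r
# Hypothetical example data, assumed to be already sorted so that the
# wanted municipality comes first within each street.
road.data <- data.frame(
  STREETS = c("Main St", "Main St", "Oak Ave", "Oak Ave", "Pine Rd"),
  MUNIC   = c("Burnaby", "Surrey", "Burnaby", "Delta", "Delta"),
  stringsAsFactors = FALSE
)

# duplicated() flags every repeat of a value after its first occurrence,
# so negating it keeps only the first row for each street.
first.rows <- road.data[!duplicated(road.data$STREETS), ]
```

Because this relies purely on row order, it only gives the intended answer when the data frame is sorted appropriately beforehand.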

Thanks again to everyone. This really is a great list.

Sarah




______________________________________________

Sarah Henderson
Department of Health Care & Epidemiology
University of British Columbia
www.firesmoke.ubc.ca

Office: 604.822.1274
Cell: 604.910.9144
Fax: 604.822.9588

Michael Smith Foundation for Health Research Trainee
Canadian Institutes of Health Research Trainee
