
Re: General-purpose function for sorting data frame

To: "'Jean V Adams'" <jvadams@usgs.gov>, kwright@eskimo.com
Subject: Re: General-purpose function for sorting data frame
From: "Liaw, Andy" <andy_liaw@merck.com>
Date: Wed, 29 Sep 2004 10:59:31 -0400
Cc: s-news@lists.biostat.wustl.edu, Thom.Burnett@cognigencorp.com
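Am I missing something?  Unless I haven't had enough coffee this morning,
wrapping a key in rev() before passing it to order() won't sort on that key
in descending order: rev() just reverses the vector, so the indices that
order() returns no longer line up with the rows.  For numeric keys, one can
pass negatives to order(), but for character/factor keys, something else is
needed, I believe.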
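
To make the point concrete, here is a minimal sketch in R.  The data frame
`dat` and the index names are made up for illustration (they are not from the
thread), and negating rank() is only one possible "something else" for
character keys, not necessarily the best one.

```r
# Toy data (hypothetical, not from the thread)
dat <- data.frame(nitro   = c(0.2, 0.6, 0.4, 0.2),
                  Variety = c("b", "a", "a", "a"),
                  stringsAsFactors = FALSE)

# rev() only reverses the vector; order() then returns indices into the
# reversed vector, which do not correspond to the rows of dat.
idx_rev <- order(rev(dat$nitro), dat$Variety)

# Numeric key: negate it to sort in decreasing order.
idx_num <- order(-dat$nitro, dat$Variety)

# Character key: negate its ranks instead (for a factor, negating
# as.integer() of the factor would sort by reversed level order).
idx_chr <- order(dat$nitro, -rank(dat$Variety))

dat[idx_rev, ]  # nitro ends up neither increasing nor decreasing
dat[idx_num, ]  # nitro decreasing, Variety increasing within nitro
dat[idx_chr, ]  # nitro increasing, Variety decreasing within nitro
```

Ties in rank() get the same (average) rank, so tied values still fall back to
the next key in the order() call.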

Andy

> From: Jean V Adams
> 
> Isn't using rev() a more convenient approach?
> 
> In your example:
> Oats[order(rev(Oats$nitro), Oats$Variety), ]
> 
> For Thom Burnett's recently posted example:
> attach(unsortedData)
> sortedData <- unsortedData[order(sex, rev(race), rev(smoking), weight), ]
> detach("unsortedData")
> 
> JVA
> 
> `·.,,  ><(((º>   `·.,,  ><(((º>   `·.,,  ><(((º>
> 
> Jean V. Adams
> Statistician
> U.S. Geological Survey
> Great Lakes Science Center
> c/o Marquette Biological Station
> 1924 Industrial Parkway
> Marquette, MI 49855  USA
> phone: 906-226-1212
> FAX: 906-226-3632
> web site: www.glsc.usgs.gov
> e-mail: jvadams@usgs.gov
> 
> From:    Kevin Wright <kwright@eskimo.com>
> Sent by: s-news-owner@lists.biostat.wustl.edu
> Date:    09/29/2004 10:04 AM
> To:      s-news@lists.biostat.wustl.edu
> Subject: [S] General-purpose function for sorting data frame
> 
> One of the often asked questions on s-news and r-help relates to sorting
> a data frame in ascending/descending order by multiple columns.
> 
> I have created a function that does the job very flexibly and in a way that
> should be easy to remember.  For example, to sort the Oats data.frame (nlme
> library) by nitro (decreasing) and Variety (increasing):
> 
> sort.data.frame(Oats, ~ -nitro + Variety)
> 
> Feedback and improvements are welcome.
> 
> sort.data.frame <- function(form,dat){  # Author: Kevin Wright
>   # Some ideas from Andy Liaw
>   #   http://tolstoy.newcastle.edu.au/R/help/04/07/1076.html
> 
>   # Use + for ascending, - for descending.
>   # Sorting is left to right in the formula
> 
>   # Usage is either of the following:
>   # library(nlme); data(Oats)
>   # sort.data.frame(~-Variety+Block,Oats) # Note: levels(Oats$Block)
>   # sort.data.frame(Oats,~nitro-Variety)
> 
>   # If dat is the formula, then switch form and dat
>   if(inherits(dat,"formula")){
>     f=dat
>     dat=form
>     form=f
>   }
>   # A one-sided formula has exactly two elements: `~` and the right-hand side
>   if(form[[1]] != "~" || length(form) != 2)
>     stop("Formula must be one-sided.")
> 
>   # Make the formula into character and remove spaces
>   formc <- as.character(form[2])
>   formc <- gsub(" ","",formc)
>   # If the first character is not + or -, add +
>   if(!is.element(substring(formc,1,1),c("+","-")))
>     formc <- paste("+",formc,sep="")
> 
>   # Extract the variables from the formula
>   if(exists("is.R") && is.R()){
>     vars <- unlist(strsplit(formc,"[\\+\\-]"))
>   }
>   else{
>     vars <- unlist(lapply(unpaste(formc,"-"),unpaste,"+"))
>   }
>   vars <- vars[vars!=""] # Remove spurious "" terms
> 
>   # Build a list of arguments to pass to "order" function
>   calllist <- list()
>   pos=1 # Position of + or -
>   for(i in 1:length(vars)){
>     varsign <- substring(formc,pos,pos)
>     pos <- pos+1+nchar(vars[i])
>     if(is.factor(dat[,vars[i]])){
>       if(varsign=="-")
>         calllist[[i]] <- -rank(dat[,vars[i]])
>       else
>         calllist[[i]] <- rank(dat[,vars[i]])
>     }
>     else {
>       if(varsign=="-")
>         calllist[[i]] <- -dat[,vars[i]]
>       else
>         calllist[[i]] <- dat[,vars[i]]
>     }
>   }
>   dat[do.call("order",calllist),]
> 
> }


