One of the frequently asked questions on s-news and r-help is how to sort
a data frame in ascending or descending order by multiple columns.
I have written a function that does this flexibly, with a syntax that
should be easy to remember. For example, to sort the Oats data frame (from
the nlme package) by nitro (decreasing) and Variety (increasing):
sort.data.frame(Oats, ~ -nitro + Variety)
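For comparison, here is roughly what such a call amounts to when written as a direct call to order(). This is only a sketch: the `toy` data frame and its values are made up for illustration (so that nlme is not needed), and it mirrors the approach of negating numeric keys and ranking factors rather than reproducing the function itself.

```r
# Hypothetical stand-in for the Oats data (values invented for illustration)
toy <- data.frame(
  nitro   = c(0.2, 0.0, 0.4, 0.2, 0.0, 0.4),
  Variety = factor(c("Victory", "Marvellous", "Victory",
                     "Golden Rain", "Victory", "Marvellous"))
)

# Descending nitro: negate the numeric column.
# Ascending Variety: rank() the factor, which orders by its level codes.
idx <- order(-toy$nitro, rank(toy$Variety))

sorted <- toy[idx, ]
print(sorted)
```

The formula interface saves having to remember which columns need negating and which need rank().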
Feedback and improvements are welcome.
sort.data.frame <- function(form,dat){
  # Author: Kevin Wright
  # Some ideas from Andy Liaw
  # http://tolstoy.newcastle.edu.au/R/help/04/07/1076.html

  # Use + for ascending, - for descending.
  # Sorting is left to right in the formula

  # Usage is either of the following:
  # library(nlme); data(Oats)
  # sort.data.frame(~-Variety+Block,Oats) # Note: levels(Oats$Block)
  # sort.data.frame(Oats,~nitro-Variety)

  # If dat is the formula, then switch form and dat
  if(inherits(dat,"formula")){
    f=dat
    dat=form
    form=f
  }
  # A one-sided formula has exactly two elements: "~" and the right-hand side
  if(form[[1]] != "~" || length(form) != 2)
    stop("Formula must be one-sided.")

  # Make the formula into character and remove spaces
  formc <- as.character(form[2])
  formc <- gsub(" ","",formc)
  # If the first character is not + or -, add +
  if(!is.element(substring(formc,1,1),c("+","-")))
    formc <- paste("+",formc,sep="")

  # Extract the variables from the formula
  if(exists("is.R") && is.R()){
    vars <- unlist(strsplit(formc, "[\\+\\-]"))
  } else {
    # S-PLUS branch: unpaste() is the S-PLUS counterpart of strsplit()
    vars <- unlist(lapply(unpaste(formc,"-"),unpaste,"+"))
  }
  vars <- vars[vars!=""] # Remove spurious "" terms

  # Build a list of arguments to pass to "order" function
  calllist <- list()
  pos=1 # Position of + or -
  for(i in 1:length(vars)){
    varsign <- substring(formc,pos,pos)
    pos <- pos+1+nchar(vars[i])
    if(is.factor(dat[,vars[i]])){
      # Factors cannot be negated directly, so sort on their ranks
      if(varsign=="-")
        calllist[[i]] <- -rank(dat[,vars[i]])
      else
        calllist[[i]] <- rank(dat[,vars[i]])
    } else {
      # Non-factor columns are assumed numeric when sorted descending
      if(varsign=="-")
        calllist[[i]] <- -dat[,vars[i]]
      else
        calllist[[i]] <- dat[,vars[i]]
    }
  }
  dat[do.call("order",calllist),]
}
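
To make the parsing step easier to follow, here is a small standalone sketch of how a formula's right-hand side can be broken into variable names and signs. It is not the function's exact code: it uses deparse(form[[2]]) in place of as.character(form[2]), and regmatches() to collect the signs instead of tracking character positions. Like the function, it assumes variable names contain no "+" or "-" characters.

```r
form <- ~ -nitro + Variety

# Right-hand side as a string with spaces removed: "-nitro+Variety"
formc <- gsub(" ", "", deparse(form[[2]]))

# Prepend "+" when the first term has no explicit sign
if (!substring(formc, 1, 1) %in% c("+", "-"))
  formc <- paste("+", formc, sep = "")

# Variable names: split on the sign characters and drop empty pieces
vars <- unlist(strsplit(formc, "[+-]"))
vars <- vars[vars != ""]

# Signs: every "+" or "-" in left-to-right order, one per variable
signs <- regmatches(formc, gregexpr("[+-]", formc))[[1]]

print(data.frame(variable = vars, sign = signs))
```

Each sign then decides whether the matching column is passed to order() as is or negated.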