s-news

Re: sorting a dataframe

To: "Data Analytics Corp." <dataanalytics@earthlink.net>, "s-news@lists.biostat.wustl.edu" <s-news@lists.biostat.wustl.edu>
Subject: Re: sorting a dataframe
From: "Austin, Matt" <maustin@amgen.com>
Date: Mon, 28 Jan 2008 08:41:32 -0800
Accept-language: en-US
In-reply-to: <479D2C0C.3020507@earthlink.net>
Thread-topic: [S] sorting a dataframe
It would appear that your month variable is not numeric.

For example, the following gives what you want.  Year and month are both 
numeric.

> temp <- data.frame(year = c(rep(2007, length.out = length(4:12)), 2008),
+                    month = c(4:12, 1))
> sort.col(temp, "@ALL", 1:2, ascending = T)
   year month
 1 2007     4
 2 2007     5
 3 2007     6
 4 2007     7
 5 2007     8
 6 2007     9
 7 2007    10
 8 2007    11
 9 2007    12
10 2008     1


This example gives a different sorting because month is now a character.

> temp2 <- data.frame(year = c(rep(2007, length.out = length(4:12)), 2008),
+                     month = as.character(c(4:12, 1)))
> sort.col(temp2, "@ALL", 1:2, ascending = T)
   year month
 7 2007    10
 8 2007    11
 9 2007    12
 1 2007     4
 2 2007     5
 3 2007     6
 4 2007     7
 5 2007     8
 6 2007     9
10 2008     1
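
The effect can be seen in isolation with a minimal sketch that uses only sort() on a plain vector. The name m is made up for illustration and is not part of your data. Character strings are compared one character at a time, so "10" lands ahead of "4".

```r
# Month values stored as character strings instead of numbers
# (m is a hypothetical name used only for this illustration)
m <- as.character(c(4:12, 1))

# Strings are compared character by character, so "10" sorts before "4"
m.char.sorted <- sort(m)
m.char.sorted

# Converting back to numeric restores the expected ordering
m.num.sorted <- sort(as.numeric(m))
m.num.sorted
```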

I haven't been able to duplicate your exact order, but a naïve guess would be 
that your month is a factor and the levels are dictating the sort order that 
you are seeing.

Try levels(x$month) and see what you get.
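
If month does turn out to be a factor, something along the following lines may help. This is only a sketch under that assumption: x is rebuilt by hand here with month as a factor (your real levels may differ), and order() is used for the final sort in place of sort.col.

```r
# Hypothetical reconstruction of x, with month stored as a factor
x <- data.frame(year = c(2008, rep(2007, 9)),
                month = factor(as.character(c(1, 4:12))))

# Inspect the levels; their order, not the numeric value, drives the sort
levels(x$month)

# Convert the factor labels (not the internal codes) back to numbers
x$month <- as.numeric(as.character(x$month))

# Sort the rows by year, then by month within year
x.sorted <- x[order(x$year, x$month), ]
x.sorted
```

Going through as.character() first matters: calling as.numeric() directly on a factor returns the internal level codes rather than the month numbers.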

--Matt

Matt Austin
Director, Biostatistics
Amgen, Inc

-----Original Message-----
From: s-news-owner@lists.biostat.wustl.edu 
[mailto:s-news-owner@lists.biostat.wustl.edu] On Behalf Of Data Analytics Corp.
Sent: Sunday, January 27, 2008 5:13 PM
To: s-news@lists.biostat.wustl.edu
Subject: [S] sorting a dataframe

Hi,

I have a simple dataframe, x, of two columns:

   year  month
   2008  1
   2007  4
   2007  5
   2007  6
   2007  7
   2007  8
   2007  9
   2007  10
   2007  11
   2007  12

that I want to sort by year, then month.  Forget the fact that I could just
move the first row to the end of x to get what I want; this is just an example.
I used

          sort.col(x, "@ALL", 1:2, ascending = T)

and got

   year month
   2007  8
   2007  9
   2007  10
   2007  11
   2007  12
   2007  4
   2007  5
   2007  6
   2007  7
   2008  1

when, of course, I should have gotten

   year  month
   2007  4
   2007  5
   2007  6
   2007  7
   2007  8
   2007  9
   2007  10
   2007  11
   2007  12
   2008  1

I also tried sort.col(x, 1:2, 1:2) and got the same result.  Why?  This
should be a simple sort.   I put the data into Excel (sorry about that)
and got the correct sort.  Why not here?

Thanks,

Walt Paczkowski

--------------------------------------------------------------------
This message was distributed by s-news@lists.biostat.wustl.edu.  To unsubscribe 
send e-mail to s-news-request@lists.biostat.wustl.edu with the BODY of the 
message:  unsubscribe s-news
