s-news

Re: Data Manipulation for SAS User

To: Wim Kimmerer <kimmerer@sfsu.edu>
Subject: Re: Data Manipulation for SAS User
From: Tony Plate <tplate@blackmesacapital.com>
Date: Fri, 15 Apr 2005 09:10:23 -0600
Cc: Joseph E Wolfe <Joseph_E_Wolfe@notes.ntrs.com>, s-news@wubios.wustl.edu
In-reply-to: <5.2.1.1.2.20050413223301.00bb0c48@sfsu.edu>
References: <5.2.1.1.2.20050413223301.00bb0c48@sfsu.edu>
User-agent: Mozilla Thunderbird 1.0.2 (Windows/20050317)
I got around the sometimes surprising effects of the interaction of time zones with unwanted times on timeDate variables by setting options(time.zone="GMT") in my S.init file. As far as I've been able to tell, with this setting all functions behave intuitively with timeDate variables that have no time. And dates represented as character vectors with no time get converted to a timeDate object with zero as the number of milliseconds.

But, someone else might have a better solution...

As for aggregation and merging, AFAIK the functions mergeSeries and aggregateSeries work well with the timeDate vector that defines the positions of a timeSeries object.

-- Tony Plate

PS: an example of the hidden traps of working with timeDate variables: the following two calls give different answers where one might naively expect the same one:

     > options(time.zone="MDT")
     > wdydy(dates("01/01/1998"))
       weekday yearday year
     1       3     365 1997
     > wdydy("1998/01/01")
       weekday yearday year
     1       4       1 1998

Another example is that in a non-GMT time zone, the result of as.numeric() applied to a timeDate with no time specified is fractional:

     > as.numeric(timeDate("2001/01/01",zone="GMT"))
     [1] 14976
     > as.numeric(timeDate("2001/01/01",zone="MDT"))
     [1] 14976.29

One can't simply apply floor() to a timeDate in a non-GMT zone to get an integer number of days: with a non-zero time on the date, the result can roll over into the next day, and even with zero times, whether floor() rounds the right way depends on whether the time zone is ahead of or behind GMT.
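
The flooring problem can be reproduced outside S-PLUS. The following is a minimal sketch in Python, not S-PLUS code. It assumes day numbers are counted in GMT from 1 Jan 1960 (which is consistent with the 14976 shown above for 2001/01/01) and uses fixed UTC offsets of -7 and +1 hours as stand-ins for a western and an eastern zone. The names gmt_days and local_day are hypothetical helpers.

```python
import math
from datetime import date, datetime, timedelta, timezone

# Assumed day zero: 2001-01-01 is day 14976 when counting from 1960-01-01.
EPOCH = datetime(1960, 1, 1, tzinfo=timezone.utc)

WEST = timezone(timedelta(hours=-7))  # fixed offset, behind GMT
EAST = timezone(timedelta(hours=1))   # fixed offset, ahead of GMT

def gmt_days(dt):
    """Fractional number of days since the epoch, measured in GMT."""
    return (dt - EPOCH) / timedelta(days=1)

def local_day(dt):
    """Integer day number of the local calendar date, ignoring the zone."""
    return (dt.date() - date(1960, 1, 1)).days

midnight_west = datetime(2001, 1, 1, 0, 0, tzinfo=WEST)
evening_west = datetime(2001, 1, 1, 20, 0, tzinfo=WEST)
midnight_east = datetime(2001, 1, 1, 0, 0, tzinfo=EAST)

for dt in (midnight_west, evening_west, midnight_east):
    # floor() of the GMT value agrees with the local date only in the first case
    print(dt, gmt_days(dt), math.floor(gmt_days(dt)), local_day(dt))
```

Only the first case floors to the local date's day number. The evening time behind GMT rolls over into the next day, and midnight ahead of GMT floors to the previous day, which is why working from the local calendar date (or keeping everything in GMT) is the simpler route.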

And yes, there are ways around all of these problems, and the powerful timeDate functions for aligning and calculating relative dates are very useful. But they're not simple, and still there are so many tricky gotchas that I've found the easiest thing to do is have options(time.zone="GMT") all the time.

Wim Kimmerer wrote:
A caution regarding dates: in recent versions of Splus dates are in "timeDate" format, which I have always found difficult to work with (I am being polite here). The main problems with it are 1) it is actually two variables somehow stuffed into a single apparent "column" in the data frame; 2) it insists on having a time variable even if all you care about is the date, so you get situations where daylight savings time changes or you change time zones and things stop matching up the way they should.

Merging with timeDate, and using other functions that ought to work just fine (such as "by" and "aggregate") get all buggered up with timeDate variables. When I have to merge things by date (and time is not an issue) I convert the date to year and julian day and do my calculations, then convert back if I have to.
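
The year-plus-julian-day workaround described above can be sketched as follows. This is a Python illustration rather than S-PLUS, and the two small tables (temperature samples and flows) and the helper name day_key are hypothetical, made up only to show the idea: one timestamp carries a stray hour, so an exact timestamp match fails while the (year, day of year) key still lines up.

```python
from datetime import datetime

def day_key(ts):
    """Reduce a timestamp to (year, day of year), discarding the time."""
    return (ts.year, ts.timetuple().tm_yday)

# Hypothetical data: the second sample is off by one hour,
# as might happen after a daylight saving or time zone change.
samples = [(datetime(2004, 3, 1, 0, 0), 14.2),
           (datetime(2004, 3, 2, 1, 0), 15.1)]
flows = [(datetime(2004, 3, 1, 0, 0), 310.0),
         (datetime(2004, 3, 2, 0, 0), 295.0)]

# Matching on the full timestamp misses the shifted row.
flow_times = {t for t, _ in flows}
exact_matches = [t for t, _ in samples if t in flow_times]

# Matching on (year, day of year) pairs every sample with its flow.
flow_by_day = {day_key(t): v for t, v in flows}
merged = [(day_key(t), temp, flow_by_day.get(day_key(t)))
          for t, temp in samples]

print(len(exact_matches))
print(merged)
```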

I do what you want to do all the time - calculate the mean for some set of blocking variables and then merge back to the original data, usually for the purpose of calculating anomalies (e.g., subtracting out seasonal signals). It is easy enough to write functions to do that in one line of code if it is something you do a lot.
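
As a rough illustration of the "mean by blocking variable, then merge back" pattern, here is a short Python sketch (not S-PLUS). The function name anomalies and the sample values are hypothetical. It computes one mean per block and subtracts it from each observation in the original order.

```python
from collections import defaultdict
from statistics import mean

def anomalies(values, blocks):
    """Subtract each block's mean from the values belonging to that block."""
    grouped = defaultdict(list)
    for v, b in zip(values, blocks):
        grouped[b].append(v)
    block_mean = {b: mean(vs) for b, vs in grouped.items()}
    # "Merge back": look up each row's block mean and subtract it.
    return [v - block_mean[b] for v, b in zip(values, blocks)]

# Hypothetical example: four observations blocked by month.
values = [10.0, 14.0, 20.0, 22.0]
months = [1, 1, 7, 7]
result = anomalies(values, months)
print(result)
```

With a tuple such as (month, station) as the block label, the same function handles several blocking variables at once.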

Splus may seem strange and impenetrable but hang in there - pretty soon it will seem quite familiar (although still impenetrable).

Wim


======================
Dr. Wim Kimmerer
Research Professor of Biology
Romberg Tiburon Center
San Francisco State University
3152 Paradise Drive
Tiburon CA 94920
Ph. (415) 338-3515
Fax (415) 435-7120
http://online.sfsu.edu/~kimmerer/
--------------------------------------------------------------------
This message was distributed by s-news@lists.biostat.wustl.edu.  To
unsubscribe send e-mail to s-news-request@lists.biostat.wustl.edu with
the BODY of the message:  unsubscribe s-news


