Thanks to Greg Snow and Don MacQueen for taking the time to answer my
question. The key aspects of their messages are attached below. In the end, I
have decided to follow Don's advice and start to get a clue about timeSeries
objects. This causes me to modify the constructor of my example as follows:
> version
Version 6.0 Release 1 for Sun SPARC, SunOS 5.6 : 2000
> set.seed(9)
> tsObj <- timeSeries( position = sort(as(c(round(runif(10)*100,1)+14000),
+ "timeDate")), data = data.frame(x = rnorm(10)))
> tsObj
Positions x
05/09/1998 16:48:00.000 0.47039043
05/10/1998 04:48:00.000 -0.40612622
05/13/1998 09:36:00.000 -1.60787282
06/08/1998 07:12:00.000 -0.19931712
06/10/1998 02:24:00.000 0.08987583
06/28/1998 00:00:00.000 -1.31900056
07/03/1998 16:48:00.000 -2.15590363
07/08/1998 09:36:00.000 -0.06949902
07/08/1998 14:24:00.000 -0.19397886
07/24/1998 04:48:00.000 0.09826620
At this point, as Don points out, the use of aggregate() becomes quite
powerful. For example:
> aggregate(tsObj, by = "months", adj = 1, FUN = mean)
Positions x
06/01/1998 00:00:00.000 -0.5145362
07/01/1998 00:00:00.000 -0.4761473
08/01/1998 00:00:00.000 -0.5802788
Of course, this gets me only part of the way to my original question, but it
seems clear that using the "moving" argument, along with merging the raw data
onto a timeSeries object that has all dates present, will get me where I want
to go.
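
One possible reading of that plan can be sketched in base R syntax (close to, but not the same as, S-PLUS, and without the timeSeries class). This is only an illustration under assumptions: the names obs, grid, full and k are made up here, the window is a trailing one of k calendar days, and days with no observation are simply ignored in the average. It does not use aggregate() or its "moving" argument.

```r
# Irregular observations (illustrative data, not from the thread)
obs <- data.frame(date = as.Date("1998-05-01") + c(0, 1, 4),
                  x = c(1, 3, 5))

# A grid with every calendar day present
grid <- data.frame(date = seq(min(obs$date), max(obs$date), by = "day"))

# Merge the raw data onto the full grid; days without data get NA
full <- merge(grid, obs, all.x = TRUE)

# Trailing moving average over k days, skipping the NA days
k <- 3
full$ma <- sapply(seq_len(nrow(full)), function(i) {
  lo <- max(1, i - k + 1)
  w <- full$x[lo:i]
  if (all(is.na(w))) NA_real_ else mean(w, na.rm = TRUE)
})

full
```

Note that this assumes at most one observation per day; with repeated dates, merge() would produce one row per observation and the window would no longer span k calendar days.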
##########################
Greg Snow said:
Here is something to get you started. You would need to change it to work
with dates instead of raw numbers, but the idea should work and be easy
enough to wrap in a function:
x <- c(1,2,3,7,8,9,11,25)
y <- c(10,11,12, 2,3,4, 25, 11)
avmat <- sapply( seq(along=x), function(i) as.numeric( abs(x-x[i])<=1 ) )
avmat <- sweep(avmat, 1, rowSums(avmat), "/")
x.ma <- avmat %*% x
y.ma <- avmat %*% y
x.ma
y.ma
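
As a rough illustration of the "change it to work with dates and wrap it in a function" step, here is a sketch in base R syntax using R's Date class rather than S-PLUS timeDate objects. The function name window_mean and its arguments are hypothetical, and the window is assumed to be symmetric, measured in days.

```r
# Hypothetical helper: average of 'values' over all observations whose
# date lies within 'half_width' days of each observation's own date.
window_mean <- function(dates, values, half_width = 1) {
  d <- as.numeric(dates)                      # Date -> days since the epoch
  # inwin[i, j] is TRUE when observation j is within the window of i
  inwin <- outer(d, d, function(a, b) abs(a - b) <= half_width)
  w <- inwin / rowSums(inwin)                 # each row becomes averaging weights
  as.vector(w %*% values)
}

# Same spacing and y values as the numeric example above
dates <- as.Date("1998-05-01") + c(1, 2, 3, 7, 8, 9, 11, 25)
y <- c(10, 11, 12, 2, 3, 4, 25, 11)
y_ma <- window_mean(dates, y)
y_ma
```

Like the original, this builds an n-by-n matrix, so it is only practical for series of modest length.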
#######################
Don MacQueen noted:
Here's a snippet from something I've done:
rmss <- aggregate(tsm,by='minutes',k.by=15,adj=1,FUN=function(x)
c(mean(x),length(x)))
tsm is a timeSeries object, created using the timeSeries() function
by='minutes' is pretty obvious
k.by=15 says aggregate over 15-minute intervals
adj=1 says assign the result to the end of the 15-minute interval
(adj=0 would assign to the beginning, adj=0.5 to the middle)
FUN is any function you want that makes sense
aggregate() handles gaps in the time series just fine. That takes
care of the "irregular" part of your series of dates.
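
To make the roles of k.by and adj concrete, here is a small sketch in base R syntax that imitates the idea with POSIXct times. It is not the S-PLUS aggregate() implementation: the helper bin_means and its arguments are hypothetical, and it assumes intervals that are closed on the left, aligned to the clock, and dropped entirely when they contain no data.

```r
# Hypothetical helper: mean and count per fixed-width interval.
# adj = 0 labels each interval by its start, adj = 1 by its end,
# adj = 0.5 by its midpoint.
bin_means <- function(times, values, width_secs = 15 * 60, adj = 1) {
  secs  <- as.numeric(times)
  start <- floor(secs / width_secs) * width_secs   # start of each obs's interval
  stat  <- tapply(values, start, mean)
  n     <- tapply(values, start, length)
  label <- as.numeric(names(stat)) + adj * width_secs
  data.frame(position = as.POSIXct(label, origin = "1970-01-01", tz = "UTC"),
             mean = as.vector(stat),
             n = as.vector(n))
}

# Irregular times with a gap: three points in 10:00-10:15, one in
# 10:15-10:30, nothing until one point in 11:00-11:15
times  <- as.POSIXct("1998-05-09 10:00:00", tz = "UTC") + c(60, 300, 840, 1000, 4000)
values <- c(1, 2, 3, 10, 20)
res <- bin_means(times, values)
res
```

Only the three intervals that contain data appear in the result, which is the sense in which gaps cause no trouble in this sketch.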
I thought timeSeries() would object to non-unique dates, so first I tried this:
(I changed the rounding)
> set.seed(9)
> tmpDF <- data.frame( date =
+ sort(as(c(round(runif(10)*100,1)+14000), "timeDate")), x = rnorm(10))
> tmpDF$date@format <- "%m-%d-%02y %02H:%02M"
> tmpDF
date x
1 5-9-98 09:48 0.47039043
2 5-9-98 21:48 -0.40612622
3 5-13-98 02:36 -1.60787282
4 6-8-98 00:12 -0.19931712
5 6-9-98 19:24 0.08987583
6 6-27-98 17:00 -1.31900056
7 7-3-98 09:48 -2.15590363
8 7-8-98 02:36 -0.06949902
9 7-8-98 07:24 -0.19397886
10 7-23-98 21:48 0.09826620
> tsx <- timeSeries(data=tmpDF$x, positions=tmpDF$date)
> aggregate(tsx,by='months',FUN=mean)
Positions 1
5-1-98 00:00 -0.5145362
6-1-98 00:00 -0.4761473
7-1-98 00:00 -0.5802788
> aggregate(tsx,by='months',adj=1,FUN=mean)
Positions 1
6-1-98 00:00 -0.5145362
7-1-98 00:00 -0.4761473
8-1-98 00:00 -0.5802788
See help(aggregateSeries) for more information.
But I was wrong about timeSeries having trouble with non-unique
positions. Look at this (using your tempDF):
> foo <- timeSeries(data=tempDF$x,pos=tempDF$date)
> foo
Positions 1
5-9-98 0.47039043
5-9-98 -0.40612622
5-12-98 -1.60787282
6-7-98 -0.19931712
6-9-98 0.08987583
6-27-98 -1.31900056
7-3-98 -2.15590363
7-7-98 -0.06949902
7-8-98 -0.19397886
7-23-98 0.09826620
> tempDF$date@format <- "%m-%d-%02y %02H:%02M"
> tempDF
date x
1 5-9-98 17:00 0.47039043
2 5-9-98 17:00 -0.40612622
3 5-12-98 17:00 -1.60787282
4 6-7-98 17:00 -0.19931712
5 6-9-98 17:00 0.08987583
6 6-27-98 17:00 -1.31900056
7 7-3-98 17:00 -2.15590363
8 7-7-98 17:00 -0.06949902
9 7-8-98 17:00 -0.19397886
10 7-23-98 17:00 0.09826620
There are a lot of powerful timeSeries-related functions, not all of
which are easy to understand. align() is one that might be used to
create your factors, if you end up going about it that way. (I've
only dabbled with the align() function, so that might have been an
optimistic statement.)
#######################
Thanks again,
Dave Kane
--
David Kane
Geode Capital Management
617-563-0122
david.d.kane@fmr.com