Summary about unpaste

To: <s-news@lists.biostat.wustl.edu>
Subject: Summary about unpaste
From: Stefano Sofia <stefano.sofia@usa.net>
Date: Tue, 23 Sep 2003 21:05:34 +0100

Dear Splus users,
Sorry for the delay in posting my summary about code involving unpaste.

Many thanks to Matt Calder, James Holtman, Tom Burr, Nick Ellis, David L
Lorenz and Michael Camilleri.
All of you gave useful replies. In particular I tried Tom Burr’s and Nick
Ellis’ hints, and they work properly (I don’t have Splus 6 so the command
%in% is not available).
Thank you again.

Below are my original question and all the contributions.

Original Question:
I am dealing with a daily timeseries. This is a dataframe formed by two
columns; in the first column there is the date (dd/mm/yyyy), in the second
there is the data. This time series is not a time series object.
For managing the date the command unpaste is simple but effective.
My problem is: I need to take into consideration only some intervals of my
timeseries, for example from 1750 to 1751;
the line

mytimeseries[unpaste(mytimeseries [[1]], sep = "/")[[3]] == c(1750:1751), ]

creates a dataframe with 365 rows (and two columns), taking into consideration
half of the days of the year 1750 and half of the days of the year 1751; I would
have expected a dataframe with 2 x 365 rows.
Whatever interval is taken (from 1750 to 1780, for example), only 365 rows are
ever created. Why? Where is the mistake?
How can I solve this problem without losing any data?

Matt Calder:

You might have better luck using the substring function. The unpaste
function returns a list and can vary in size/length. The substring function
always returns a string and given your fixed format dates should be easy to
use.
Also, working with dates is easier if you drop the '/', so code them as
yyyymmdd. Just a suggestion.

Matt
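
A minimal sketch of the substring suggestion above. The data frame is made up for illustration, and the code assumes every date is exactly dd/mm/yyyy, so the year always sits in characters 7 to 10. It is written to run in R; details may differ in older S-PLUS versions.

```r
# Hypothetical toy data standing in for the real series
mytimeseries <- data.frame(
  date  = c("31/12/1749", "01/01/1750", "15/06/1751", "01/01/1752"),
  value = c(1.2, 3.4, 5.6, 7.8)
)

# as.character guards against the date column being stored as a factor
d <- as.character(mytimeseries[[1]])

# Characters 7 to 10 of a fixed-width dd/mm/yyyy string hold the year
yr <- as.numeric(substring(d, 7, 10))

# Recode as yyyymmdd so that dates sort and compare as plain numbers
ymd <- as.numeric(paste(substring(d, 7, 10),
                        substring(d, 4, 5),
                        substring(d, 1, 2), sep = ""))

# Keep everything from 1 Jan 1750 to 31 Dec 1751
subset1750 <- mytimeseries[ymd >= 17500101 & ymd <= 17511231, ]
```

With the yyyymmdd coding, any date interval becomes a simple numeric range test, which avoids the element-by-element comparison problem altogether.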

Tom Burr:

None of the years can equal c(1750:1751), so if you look at the result of your
attempt, it should be very different from what you want.

try

mytimeseries[match(unpaste(mytimeseries[[1]], sep = "/")[[3]], c(1750:1751),
nomatch = 0) > 0, ]

Also, you might prefer julian and month.day.year, so you can work with integers
as dates and then use month.day.year to get the year.
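
A small sketch of how match behaves with nomatch = 0, using made-up years rather than the real series. match returns the position of each value within the lookup vector, so the result needs to be turned into a logical mask before it is used to pick rows.

```r
yr <- c(1749, 1750, 1750, 1751, 1752)   # hypothetical years

pos <- match(yr, c(1750:1751), nomatch = 0)
# pos is 0 1 1 2 0: positions within c(1750:1751), not a T/F mask

keep <- pos > 0
# keep is F T T T F, which can be used as a row selector
```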

Nick Ellis:

mytimeseries[unpaste(mytimeseries [[1]], sep = "/")[[3]] == c(1750:1751), ]

should have been 

mytimeseries[unpaste(mytimeseries [[1]], sep = "/")[[3]] == 1750 |
unpaste(mytimeseries [[1]], sep = "/")[[3]] == 1751, ]

or

mytimeseries[is.element(unpaste(mytimeseries [[1]], sep = "/")[[3]],
1750:1751),]

or simply 

mytimeseries[unpaste(mytimeseries [[1]], sep = "/")[[3]] %in% 1750:1751, ]


Your version invokes the 'recycling rule' in which the shorter vector
(1750:1751) is repeated until it equals the length of the longer vector
(unpaste(mytimeseries [[1]], sep = "/")[[3]]). Here are some experiments with
the recycling rule.

> 1:10 == 1:2
[1] T T F F F F F F F F
> x <- rep(1:3,4)
> x == 1:2
[1] T T F F F F T T F F F F
> x == 1:3
[1] T T T T T T T T T T T T
> x == 1:4
[1] T T T F F F F F F F F F
> x == 1:6
[1] T T T F F F T T T F F F
> x == 1
[1] T F F T F F T F F T F F
> x
[1] 1 2 3 1 2 3 1 2 3 1 2 3
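
The same effect on a year column can be sketched as follows. The six years are made up for illustration. The == test keeps only the rows where the recycled pattern happens to line up, while is.element keeps every row in the set.

```r
# Hypothetical years for six observations, three per year
yr <- c(1750, 1750, 1750, 1751, 1751, 1751)

# == recycles 1750:1751 as 1750 1751 1750 1751 1750 1751,
# so only positions where the pattern lines up with yr are TRUE
recycled <- yr == c(1750:1751)        # T F T F F T

# is.element asks, for each year, whether it is in the set at all
member <- is.element(yr, 1750:1751)   # T T T T T T

n.recycled <- sum(recycled)   # 3, half of the rows
n.member   <- sum(member)     # 6, every row
```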

David Lorenz:

Replace the == with %in% (in S-PLUS 6.1) or use the is.element function in a
previous version of S-PLUS. The == operator compares element by element, so it
is only suitable for testing against a single value; %in% and is.element test
membership in a vector.
Dave


Michael Camilleri:

It may be in your comparison operator. 

mytimeseries[unpaste(mytimeseries [[1]], sep = "/")[[3]] == c(1750:1751), ]

Your expression likely expands c(1750:1751) to a vector
(c(1750,1751,1750,1751,...)) of the same length as mytimeseries[[1]], so you
are not doing a full comparison.

Try

mytimeseries[is.element(unpaste(mytimeseries [[1]], sep = "/")[[3]],
c(1750:1751)), ]

which tests to see if each element is in the set c(1750,1751), and returns
true if it is. Also, NAs return F, which is handy if you have missing data.

Alternatively, you could convert the date in dd/mm/yyyy format to a date
object, and calculate the year e.g. years(dates(mytimeseries[[1]])), and
then do the comparison.

Hope that helps
Michael Camilleri
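
A brief sketch of the point about missing values, using a made-up vector of years. It is written to run in R; behaviour in a particular S-PLUS version is an assumption.

```r
yr <- c(1750, NA, 1751, 1752)   # hypothetical years with one missing value

eq  <- yr == 1750                  # T NA F F: the NA propagates
mem <- is.element(yr, 1750:1751)   # T F T F: the NA is simply not in the set
```

Using mem as a row selector drops the row with the missing year, whereas an NA in a logical selector typically produces a row of NAs in the result.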



