If you have the S+FinMetrics module, there are a couple of useful vectorised
functions that will run fast on fairly large data sets. The first is the tslag()
function, which can be used within ifelse(), but you will need to convert
the data.frame into a matrix and then back if you wish to make ifelse() work on
the entire table. This tests only the rows directly above and below, so it
fills an isolated NA but leaves a run of two or more consecutive NAs alone.
For example, if your original data was in a data frame called MyRows, then
you can do something like:
MyRows.mat <- as.matrix(MyRows) # ifelse() needs a matrix here
# an NA with a 1 directly above and below becomes 1
MyRows.mat <- ifelse(tslag(MyRows.mat)==1 & tslag(MyRows.mat,-1)==1 &
    is.na(MyRows.mat),1,MyRows.mat)
# an NA with a 0 directly above and below becomes 0
MyRows.mat <- ifelse(tslag(MyRows.mat)==0 & tslag(MyRows.mat,-1)==0 &
    is.na(MyRows.mat),0,MyRows.mat)
MyRows.df <- data.frame(MyRows.mat) # convert back to data.frame
colIds(MyRows.df) <- colIds(MyRows) # replace original column names
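If FinMetrics is not to hand, the same one-step neighbour test can be sketched
with plain subscripting. This is only a sketch under assumptions: fill.single
is a hypothetical helper name, not a FinMetrics function, and each column is
assumed to be a numeric vector holding only 0, 1 and NA.

```r
# Hypothetical helper (not part of FinMetrics): fill an isolated NA when
# the values directly above and below it are equal (both 0 or both 1).
fill.single <- function(x) {
    n <- length(x)
    if (n < 3) return(x)          # nothing can sit between two neighbours
    prev <- c(NA, x[-n])          # value one row up
    nxt  <- c(x[-1], NA)          # value one row down
    fill <- is.na(x) & !is.na(prev) & !is.na(nxt) & prev == nxt
    x[fill] <- prev[fill]
    x
}

# Apply column by column to a data frame such as MyRows:
# for (j in seq(along = MyRows)) MyRows[[j]] <- fill.single(MyRows[[j]])
```

Like the tslag() version, this leaves runs of two or more consecutive NAs
untouched.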
A more efficient alternative (also needing S+FinMetrics) is to use the interpNA
function with linear interpolation. Linear interpolation yields 0 when the NAs
lie between zeros and 1 when they lie between ones, as you would like. When
the NAs lie between a 1 and a 0 (or vice versa) it yields a fraction: 0.5 for
a single NA, and other fractions for a longer run of NAs. So the trick is to
interpolate and then set every value that is not exactly 0 or 1 back to NA,
e.g.
MyRows.df <- interpNA(MyRows,method="linear")
MyRows <- ifelse(MyRows.df==0 | MyRows.df==1,MyRows.df,NA) # overwrite the original data
This should be very fast even if you have a large number of rows and columns.
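For readers without FinMetrics, the interpolate-then-mask idea can be sketched
with the standard approx() function. This is a sketch under assumptions, not
the interpNA implementation: fill.runs is a hypothetical name, and each column
is assumed to be a numeric vector holding only 0, 1 and NA. It masks any
interpolated value that is not exactly 0 or 1, which also covers runs of
several NAs between a 0 and a 1.

```r
# Hypothetical helper: linearly interpolate over NAs, then put NA back
# wherever the interpolated value is not exactly 0 or 1.
fill.runs <- function(x) {
    idx <- seq(along = x)
    ok <- !is.na(x)
    if (sum(ok) < 2) return(x)    # approx() needs two known points
    # positions outside the range of known points come back as NA
    y <- approx(idx[ok], x[ok], xout = idx)$y
    y[!is.na(y) & y != 0 & y != 1] <- NA
    y
}

# Columns A and C from the original question
A <- fill.runs(c(0, NA, NA, 0, 1, NA, 1))
C <- fill.runs(c(0, NA, NA, 0, 0, NA, 0))
```

Leading and trailing NAs have no value on one side, so they stay NA.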
-----Original Message-----
From: s-news-owner@lists.biostat.wustl.edu on behalf of El Imam Hanan Attia Rizk
Sent: Tue 10/30/2007 8:36 PM
To: s-news@lists.biostat.wustl.edu
Subject: [S] replacement of character in column by values
Dear S-PLUS users,
I am working with a large data set. I have two data frames that I combined by
rows (rbind) and I got these columns (A, B, C, D):
 A  B  C  D
 0  0  0  0
NA NA NA NA
NA NA NA NA
 0  0  0  0
 1  1  0  1
NA NA NA NA
 1  1  0  1
I would like to go through each column and replace NA by 0 if it is between two
zeros, and by 1 if it is between two ones.
I would really appreciate it if any of you could give me some
suggestions.
Hanan Elimam
Ph.D student, Pharmaceutical Sciences
Université de Montréal
Faculté de pharmacie
Pavillon Jean-Coutu, bureau 3173
2940 Chemin de la polytechnique
Montréal, H3T 1J4
tél.: 514-343-6111, poste 0388
FAX: 514-343-7073