
Problem with NA, by and sapply

To: S-News <s-news@lists.biostat.wustl.edu>
Subject: Problem with NA, by and sapply
From: Tristan Lorino <tristan.lorino@lcpc.fr>
Date: Thu, 19 Oct 2006 10:16:46 +0200
Organization: LCPC
Reply-to: Tristan Lorino <tristan.lorino@lcpc.fr>
Hi, I have a problem with sapply, by and NA's. Here is an example...

> x_data.frame(IdS=c(1,1,1,2,2,2),
+ Age=c(1,3,4,2,4,6),
+ CFT=c(.2,.5,.7,.3,.5,.6))
> tau_.6
> x
  IdS Age CFT 
1   1   1 0.2
2   1   3 0.5
3   1   4 0.7
4   2   2 0.3
5   2   4 0.5
6   2   6 0.6


IdS is a coding variable for the subject (only 2 subjects here, but the final
file has about 5,000), and tau is a threshold (varying from .1 to .9 by .1).
I would like a file with the IdS variable and, for each value of IdS, two ages
(Age1 and Age2) defined as follows:
   1. Age1 = age at which max(CFT | CFT <= tau) is reached;
   2. Age2 = age at which min(CFT | CFT >= tau) is reached;
   3. if undefined, NA.

For tau=.6, I would like to have:
  IdS Age1     Age2
1   1   3      4
2   2   6      6

For tau=.1, I would like to have:
  IdS Age1     Age2
1   1   NA      1
2   2   NA      2

For tau=.7, I would like to have:
  IdS Age1     Age2
1   1   4      4
2   2   6      NA
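
To make the rule and the three expected tables above concrete, here is a minimal sketch in R syntax (an assumption: R is close to S-PLUS but not identical, e.g. it uses `<-` rather than `_` for assignment). The helper names ages_for_subject and ages_by_subject are hypothetical and are not part of the original post. The sketch tests explicitly whether any CFT value lies on the required side of tau and returns NA otherwise, rather than calling max or min on an empty selection.

```r
# Example data from the post (R syntax: `<-` instead of S-PLUS `_`).
x <- data.frame(IdS = c(1, 1, 1, 2, 2, 2),
                Age = c(1, 3, 4, 2, 4, 6),
                CFT = c(.2, .5, .7, .3, .5, .6))

# Hypothetical helper: Age1 and Age2 for the rows of one subject.
# Returns NA when no CFT value lies on the required side of tau.
ages_for_subject <- function(d, tau) {
  below <- d$CFT <= tau
  above <- d$CFT >= tau
  age1 <- if (any(below)) d$Age[below][which.max(d$CFT[below])] else NA_real_
  age2 <- if (any(above)) d$Age[above][which.min(d$CFT[above])] else NA_real_
  c(Age1 = age1, Age2 = age2)
}

# Hypothetical wrapper: one row per subject for a given threshold.
ages_by_subject <- function(x, tau) {
  res <- t(sapply(split(x, x$IdS), ages_for_subject, tau = tau))
  data.frame(IdS = sort(unique(x$IdS)), res, row.names = NULL)
}

ages_by_subject(x, 0.6)
```

If the thresholds are generated by a sequence rather than typed as literals, comparisons such as CFT <= tau may be affected by floating-point rounding; the sketch does not address that.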

If I write:
> sapply(by(x, x$IdS, function(x)
+ x[x$CFT == max(x[x$CFT <= tau, "CFT"]), "Age"]), function(x) x, simplify=T)
> sapply(by(x, x$IdS, function(x)
+ x[x$CFT == min(x[x$CFT >= tau, "CFT"]), "Age"]), function(x) x, simplify=T)
it works well, except when NA's occur:

for tau=.6 (OK) :
> sapply(by(x, x$IdS, function(x)
x[x$CFT == max(x[x$CFT <= tau, "CFT"]), "Age"]), function(x)
x, simplify = T)
[1] 3 6
> sapply(by(x, x$IdS, function(x)
x[x$CFT == min(x[x$CFT >= tau, "CFT"]), c("Age")]), function(x)
x, simplify = T)
[1] 4 6

for tau=.7:
> sapply(by(x, x$IdS, function(x)
x[x$CFT == max(x[x$CFT <= tau, "CFT"]), c("Age")]), function(x)
x, simplify = T)
[1] 4 6
> sapply(by(x, x$IdS, function(x)
x[x$CFT == min(x[x$CFT >= tau, "CFT"]), c("Age")]), function(x)
x, simplify = T)
[[1]]:
[1] 4

[[2]]:
[1] NA NA NA

I need to avoid loops, because of the 5,000 subjects (and sometimes 10
observations per subject) and the 9 thresholds...

Thank you,
Tristan


-- 
Laboratoire Central des Ponts et Chaussées
[Division ESAR - Section AGR]
Route de Bouaye BP 4129
44341 Bouguenais Cedex
France
Tél 33 (0)2 40 84 56 18
Fax 33 (0)2 40 84 59 92

