Hello,
thanks to Anne E. York, Barnali Das, James Holtman, Don MacQueen, Nick
Locantore, Chuck Taylor, Bert Gunter, and Sam Buttrey
for their quick replies.
Two things come out of their responses.
1. apply should not be used with data frames: apply was designed to work with
arrays, not lists. I find this a little confusing, since data
frames, at least those with columns all of the same length, return T when
tested with is.array. Then again, the documentation for array
implies that the data should all be of the same mode, which is not a
requirement for data frames. Since data frames are lists, lapply or
sapply is more appropriate. For example:
> apply(ttt, 2, mode)
penet n success pull z sign
"character" "character" "character" "character" "character" "character"
> lapply(ttt, mode)
$penet:
[1] "numeric"
$n:
[1] "numeric"
$success:
[1] "numeric"
$pull:
[1] "numeric"
$z:
[1] "numeric"
$sign:
[1] "character"
> sapply(ttt, mode)
penet n success pull z sign
"numeric" "numeric" "numeric" "numeric" "numeric" "character"
To confirm this we can try:
> sum(anc.sample.binpda[, 'penet'])
[1] 2700
> sum(anc.sample.binpda[, 'sign'])
Error in sum(anc.sample.binpda[, "sign"]): Numeric summary undefined for mode "character"
2. As expected, the 'penet' column is numeric, even though apply says it is
not, and the 'sign' column is character.
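A plausible explanation for the apply result is that apply first coerces the data frame to a matrix, and a matrix can hold only one mode, so a single character column turns everything into character. The sketch below is written in R syntax and assumes S-PLUS behaves similarly; the data frame `toy` and its values are made up and only stand in for `ttt`.

```r
# Hypothetical data frame standing in for 'ttt'; the values are made up.
toy <- data.frame(penet = c(10, 20, 30),
                  sign = c("+", "-", "+"),
                  stringsAsFactors = FALSE)

# The matrix form of the data frame: every cell must share one mode.
as.mat <- as.matrix(toy)

# apply() works column by column on that matrix form.
by.apply <- apply(toy, 2, mode)

# sapply() visits each column (list component) of the data frame itself.
by.sapply <- sapply(toy, mode)

print(mode(as.mat))
print(by.apply)
print(by.sapply)
```

Here apply reports "character" for both columns, while sapply reports "numeric" for `penet` and "character" for `sign`.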
But Bert Gunter suggests that "mode" may not be the right function to
use with lapply or sapply in this case: mode
returns "numeric" for factors, so "is.numeric" is more appropriate, since it
returns F for factors. In the preceding example "mode"
was OK because, at Bert's suggestion, I had changed the "ifelse(....)" to
"I(ifelse(....))" to prevent S+'s automatic conversion of character
data to factors in data frames.
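The difference between mode and is.numeric on factors, and the effect of I(), can be sketched as follows. This is R syntax with made-up data (`toy` and `kept` are hypothetical names), and it assumes S-PLUS treats factors and I() in the same way.

```r
# Hypothetical columns; the values are made up for illustration.
toy <- data.frame(x = c(1.5, 2.5, 3.5),
                  f = factor(c("+", "-", "+")))

# mode() looks at the underlying storage, so the factor looks numeric.
modes <- sapply(toy, mode)

# is.numeric() is FALSE for factors, so it separates the two columns.
nums <- sapply(toy, is.numeric)

# Wrapping a character vector in I() keeps it as character data,
# even when conversion of strings to factors is requested.
kept <- data.frame(s = I(c("+", "-", "+")), stringsAsFactors = TRUE)

print(modes)
print(nums)
print(mode(kept$s))
```

With a factor column present, sapply(toy, mode) cannot tell `x` and `f` apart, whereas sapply(toy, is.numeric) can.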
Thanks again to all who responded,
Gérald Jean
Consulting Analyst (Statistics), Actuarial Department
telephone: (418) 835-4900 ext. 7639
fax: (418) 835-6657
e-mail: gerald.jean@spgdag.ca
"In God we trust all others must bring data"