The documentation for "apply" says that it "Returns a vector or array
by applying a specified function to sections of an array." A data.frame
is not an array: it is a list of equal-length columns for which dim()
reports the number of rows and columns.
However, in both S-Plus 6.1 and R 1.8, I got the following:
> DF <- data.frame(A = c(1, NA), B = 1:2)
> apply(DF, 2, function(x) sum(is.na(x)))
A B
1 0
In this example, the data.frame DF was converted to a numeric
array and then processed, producing the anticipated answer.
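
To make that conversion visible, here is a minimal sketch that performs the coercion by hand with as.matrix before calling apply. The names M and na_counts are only illustrative, and I am assuming a current version of R rather than the versions quoted above.

```r
# Sketch: look at the matrix that apply() effectively works on
# when every column of the data frame is numeric.
DF <- data.frame(A = c(1, NA), B = 1:2)

M <- as.matrix(DF)   # explicit version of the coercion apply() performs
mode(M)              # "numeric": the NA stays a real missing value

# Count missing values column by column on the coerced matrix.
na_counts <- apply(M, 2, function(x) sum(is.na(x)))
na_counts
```

Because every column is numeric, the matrix keeps a genuine NA and the count for column A is 1.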
However, with a mixture of numbers and letters, I get something
that seems to match your description:
> DF2 <- data.frame(A = c(1, NA), B = c("a", NA))
> apply(DF2, 2, function(x) sum(is.na(x)))
A B
0 0
This happens because apply converts DF2 into a character array
and then processes it. S-Plus does not allow NAs in character vectors
or factors, so each missing value becomes the string "NA", is.na finds
nothing missing, and we get the 0's we see here. Consider the following:
> as.matrix(DF2)
A B
1 " 1" "a"
2 "NA" "NA"
Meanwhile, the same command in R produces the following:
A B
1 " 1" "a"
2 NA NA
In R, we get honest NA's, not character strings.
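
The distinction between a missing value and the two-letter string "NA" can be checked directly. This is a small sketch in R with a made-up vector x (not taken from the data above) that holds one of each:

```r
# Sketch: a character vector with an ordinary string, the literal
# string "NA", and a genuine missing value.
x <- c("a", "NA", NA)

missing_flag <- is.na(x)     # FALSE FALSE TRUE: only the real NA is missing
string_flag  <- x == "NA"    # FALSE TRUE NA: comparing with NA yields NA

# TRUE | NA is TRUE, so combining the two tests catches both cases.
either <- missing_flag | string_flag
sum(either)
```

Only the third element is missing as far as is.na is concerned, which is why a character matrix full of "NA" strings reports zero missing values.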
The following examples produce the same answers for me in both
S-Plus and R:
DF <- data.frame(A = c(1, NA), B = 1:2)
apply(DF, 2, function(x) sum(is.na(x) | (x == "NA")))
DF2 <- data.frame(A = c(1, NA), B = c("a", NA))
apply(DF2, 2, function(x) sum(is.na(x) | (x == "NA")))
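
The sapply call from the original question avoids the matrix conversion altogether, because it visits each column with its own type. A minimal sketch, assuming a current version of R (the name counts is only illustrative):

```r
# Sketch: count missing values per column without going through
# as.matrix(), so no column is turned into character strings.
DF2 <- data.frame(A = c(1, NA), B = c("a", NA))

# sapply() hands each column to the function as it is stored.
counts <- sapply(DF2, function(x) sum(is.na(x)))
counts
```

One caveat about the apply workaround above: testing x == "NA" would also count a column entry that is legitimately the text "NA", which the column-wise approach does not.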
Hope this helps. Spencer Graves
Victor Moreno wrote:
I want to count the number of missing cases for each variable in a data.frame
apply( data,2,function(x)sum(is.na(x))) always returns 0
lapply( data,function(x)sum(is.na(x))) returns a list with correct answers
sapply( data,function(x)sum(is.na(x))) returns a vector with correct answers
Why is apply not working in this situation?