>I want to count the number of missing cases for each variable in a data.frame
>
>apply( data,2,function(x)sum(is.na(x))) returns always 0
>lapply( data,function(x)sum(is.na(x))) returns a list with correct answers
>sapply( data,function(x)sum(is.na(x))) returns a vector with correct answers
>
>Why is apply not working in this situation?
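As Spencer Graves pointed out, apply() first converts the data frame to a
matrix. If there are character or factor columns, the result is a character
matrix, in which the missing values are no longer recognized by is.na(),
hence the zeros.
To avoid this, handle the data frame as a list of columns and use sapply.
Two alternatives are: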
sapply(data, function(x) sum(is.na(x)))
sapply(data, function(x) length(which.na(x)))
The latter should be more efficient.
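To make the coercion concrete, here is a minimal sketch in R syntax. The data frame `toy` and its columns are made up for illustration and are not from the original question. It only shows that the matrix form is of mode character and that the column-wise sapply count is unaffected by that; how NAs are treated after the coercion may differ between S-PLUS and R.

```r
# Hypothetical toy data: one numeric column and one factor column, both with NAs
toy <- data.frame(x = c(1, NA, 3), g = factor(c("a", NA, NA)))

# apply() works on as.matrix(toy); with a factor column present,
# the whole matrix is coerced to character
m <- as.matrix(toy)
mode(m)   # "character"

# sapply() treats the data frame as a list of columns, so each column
# keeps its own type and is.na() sees the real missing values
counts <- sapply(toy, function(x) sum(is.na(x)))
counts    # x = 1, g = 2
```

Because sapply never builds the intermediate matrix, it sidesteps the type coercion entirely, which is why it is the safer choice for mixed-type data frames.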
My favorite:
library(missing) # do this once in a session
sapply(data, numberMissing)
Tim Hesterberg
========================================================
| Tim Hesterberg Research Scientist |
| timh@insightful.com Insightful Corp. |
| (206)802-2319 1700 Westlake Ave. N, Suite 500 |
| (206)802-2500 (fax) Seattle, WA 98109-3044, U.S.A. |
| www.insightful.com/Hesterberg |
========================================================