s-news

Re: apply

To: v.moreno@ico.scs.es
Subject: Re: apply
From: Spencer Graves <spencer.graves@pdf.com>
Date: Sun, 26 Oct 2003 10:26:56 -0800
Cc: s-news@lists.biostat.wustl.edu
In-reply-to: <200310241335.09585.v.moreno@ico.scs.es>
References: <200310241335.09585.v.moreno@ico.scs.es>
The documentation for "apply" says that it "Returns a vector or array by applying a specified function to sections of an array." A data.frame is not an array: it is a list of equal-length columns that also responds to dim(). However, in both S-Plus 6.1 and R 1.8, I got the following:
> DF <- data.frame(A = c(1, NA), B = 1:2)
> apply(DF, 2, function(x) sum(is.na(x)))
A B
1 0

In this example, the data.frame DF was converted to a numeric array and then processed, producing the anticipated answer. However, with a mixture of numbers and letters, I get something that seems to match your description:
> DF2 <- data.frame(A = c(1, NA), B = c("a", NA))
> apply(DF2, 2, function(x) sum(is.na(x)))
A B
0 0

This happens because apply converts DF2 into a character array before processing it. S-Plus does not allow NAs in character vectors or factors, so the missing values become the character string "NA", is.na returns FALSE for them, and we get the 0's we see here. Consider the following:
> as.matrix(DF2)
    A    B
1 " 1" "a"
2 "NA" "NA"

Meanwhile, the same command in R produces the following:
  A    B
1 " 1" "a"
2 NA   NA

In R, we get honest NA's, not character strings. The following examples produce the same answers for me in both S-Plus and R:
DF <- data.frame(A=c(1, NA), B=1:2)
apply(DF, 2, function(x)sum(is.na(x) | (x == "NA")))

DF2 <- data.frame(A=c(1, NA), B=c("a", NA))
apply(DF2, 2, function(x)sum(is.na(x)| (x == "NA")))
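
Another way to sidestep the conversion, offered only as a minimal sketch: sapply treats the data.frame as a list and visits one column at a time, so no character array is ever built. This is the same call Victor reports as giving correct answers; DF2 is the example data.frame from above.

```r
# Same example data as above: one numeric column, one character column
DF2 <- data.frame(A = c(1, NA), B = c("a", NA))

# sapply loops over the columns of the list, so each column keeps its own
# type and is.na() sees the real missing values rather than "NA" strings
sapply(DF2, function(x) sum(is.na(x)))
```

The apply version with the extra (x == "NA") test is still useful when the data really are held in a character matrix.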

Hope this helps.
Spencer Graves

Victor Moreno wrote:

I want to count the number of missing cases for each variable in a data.frame

apply( data,2,function(x)sum(is.na(x))) always returns 0
lapply( data,function(x)sum(is.na(x))) returns a list with correct answers
sapply( data,function(x)sum(is.na(x))) returns a vector with correct answers

Why is apply not working in this situation?


