s-news

Re: array with NA values

To: Eric yang <yang_eric9@yahoo.com>
Subject: Re: array with NA values
From: Tim Hesterberg <timh@insightful.com>
Date: 7 Apr 2006 14:46:50 -0700
Cc: s-news@lists.biostat.wustl.edu
In-reply-to: <20060407115948.43566.qmail@web33911.mail.mud.yahoo.com> (message from Eric yang on Fri, 7 Apr 2006 04:59:48 -0700 (PDT))
References: <20060407115948.43566.qmail@web33911.mail.mud.yahoo.com>
The most efficient way depends on what you are doing.

Some tools you might consider:
* rowMeans and colMeans.  These allow you to specify a "dims" argument
  (the number of dimensions to treat as "row" dimensions)
* colMedians.   No dims argument, but you could temporarily convert a 3-d
  array to a matrix.
* groupMeans and subtractMeans, from the resample library (see bottom).
* combination of subscript replacement and rep, e.g. for replacement
  (v = vector of counts of non-missing values in each group,
   s = vector of summaries, one per group; each group's values must be
   contiguous in storage order, e.g. the columns of a matrix):
  x[ !is.na(x) ] <- rep(s, v)
  or to subtract a summary:
  y <- x
  y[ !is.na(x) ] <- y[ !is.na(x) ] - rep(s, v)
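
As a rough illustration of the last bullet combined with colMeans, here is a
minimal sketch that subtracts each group's mean from its non-missing values
and leaves the NAs where they were. It assumes the groups are the x[, j, k]
vectors of a 3-d array (as in the original question) and reuses the example
data from earlier in the thread; the names m, v and y are only illustrative.

```r
# Example data as in the thread: a 10 x 10 x 10 array with roughly 60% NA
x <- array(1:1000, c(10, 10, 10))
set.seed(0)
x[runif(1000) > .4] <- NA

# Mean of the non-missing values in each x[, j, k] vector.
# With dims = 1, colMeans averages over the first dimension only,
# giving a 10 x 10 matrix (NaN where a vector is entirely NA).
m <- colMeans(x, na.rm = TRUE, dims = 1)

# Count of non-missing values in each x[, j, k] vector (10 x 10 matrix)
v <- colSums(!is.na(x), dims = 1)

# Subtract each group's mean from its non-missing values; NAs stay put.
# rep() repeats the i-th mean v[i] times, which lines up with
# y[!is.na(x)] because the first subscript varies fastest in storage.
y <- x
y[!is.na(x)] <- y[!is.na(x)] - rep(as.vector(m), times = as.vector(v))
```

No apply call is needed here: the only work done per element is on the
non-missing values, and the result y has the same shape and the same NA
pattern as x.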

Hope this helps.

Tim Hesterberg

P.S.  I'm giving an advanced programming course in May, see bottom.
Among other things, we learn efficient ways to do a wide variety
of operations.

>Thanks for the email Tim.
>
>I forgot to mention that at the end of performing the calculations on the 
>non-missing elements, I need to construct the original array which will be 
>populated with the new numbers and all missing values remain the same.
>
>Any help is appreciated.
>
>
>Tim Hesterberg <timh@insightful.com> wrote:
>
>My apologies; I used my own function "omit.na" in the example below.
>Thanks to Chris Barker for noticing this.
>
>omit.na <- function(x) x[!is.na(x)]
>
>Tim Hesterberg
>
>
>>If you are doing many operations with the same data, you might
>>want to convert your data into a list with missing values omitted.
>>Each element of the list would be a vector containing non-missing values.
>>Optionally, you can give the list dimensions.
>>
>>x <- array(1:1000, c(10,10,10))
>>set.seed(0)
>>x[runif(1000) > .4] <- NA
>>
>>temp <- x
>>dim(temp) <-  c(10, prod(dim(x)[2:3]))
>>x2 <- lapply(1:ncol(temp), function(j, x) omit.na(x[,j]), x = temp)
>>x2  # list of length 100, containing non-missing values for each column of temp
>>
>># Now can turn x2 into a matrix/list hybrid if you like,
>># to let you use apply on rows or columns
>>dim(x2) <- dim(x)[2:3]
>>
>># example using apply:
>>apply(x2, 2, function(x) length(unlist(x)))
>>
>>Tim Hesterberg
>>
>>>Dear all,
>>>
>>>I have numerous 3-dimensional arrays and each array contains no entire row or 
>>>column with NA values. However, 50-60% of the values in the array are NA 
>>>values. I perform various operations on the array using the apply statement. 
>>>However, because the arrays are quite large the calculations take a while to 
>>>finish. Within the apply statement I reduce the number of calculations on 
>>>the vector by using the !is.na() statement. Is there a faster way to run the 
>>>apply command without the need to read in an entire vector from the array 
>>>which contains a large amount of NA values, i.e. should I store the data in 
>>>another format to avoid having an array with a large amount of NA values?
>>>
>>>For example, if I have a 3-dimensional 1000x1000x1000 array and use the apply 
>>>statement on dimensions 2 and 3, I'm reading in vectors of length 1000 in 
>>>the apply statement and most of the 1000 values might be NAs. Can this be 
>>>avoided?
>>>
>>>I would greatly appreciate any help on this.
>>>Dave

========================================================
| Tim Hesterberg       Research Scientist              |
| timh@insightful.com  Insightful Corp.                |
| (206)802-2319        1700 Westlake Ave. N, Suite 500 |
| (206)283-8691 (fax)  Seattle, WA 98109-3044, U.S.A.  |
|                      www.insightful.com/Hesterberg   |
========================================================
Download the S+Resample library from www.insightful.com/downloads/libraries

Training courses I'll give, near Seattle, May 15-18
    Advanced Programming in S-PLUS
        http://www.insightful.com/services/scheduledetail.asp?SID=201
    Bootstrap Methods
        http://www.insightful.com/services/scheduledetail.asp?SID=202

