s-news
[Top] [All Lists]

Re: rolling median

To: s-news@wubios.wustl.edu, mleeds@mlp.com
Subject: Re: rolling median
From: Gernot Schmidt <gernot@asrc.cestm.albany.edu>
Date: Thu, 29 Jan 2004 16:24:50 -0500 (EST)
Reply-to: Gernot Schmidt <gernot@asrc.cestm.albany.edu>
Mark,

I wrote some functions to deal with rolling (or running, as I know it) 
averages. 
Unfortunately I have no idea how you can easily modify it to calculate the 
median. But in any case you can use it to calculate the standard deviation. The 
squared standard deviation (stddev(x))^2 of a set x is given by mean(x^2) - 
mean(x)^2. So you only have to use the function once for x and another time for 
x^2. 

My function delivers NA if any value in the current window is NA. Probably you 
want to avoid this and simply ignore NA's. In this case you have to set every 
NA 
value to 0 before you execute running.mean(x): x[is.na(x)] <- 0.
The running mean of x is then: x.mean <- running.mean(x)
But this will give a smaller value for the running mean than intended. 
Therefore 
a correction factor has to be applied to the resulting running mean. This 
correction factor only depends on the running mean of a series z of zeros and 
ones: z <- ifelse(is.na(x),1,0). 
Then calculate the running mean of z : z.mean <- running.mean(z)
Now the running mean of x can be corrected: x.mean <- x.mean/(1-z.mean)


---
running.mean <- function(data, win.size, centered = T, fillup.na = F)
{
        # written 5/1/03 by Gernot Schmidt
        # 
        # calculates running mean r_(k,i) of dataset s_i
        # k denotes window size
        #
        # r_(k,i) = 1/k *sum(s_(i+j-1),j=1..k)
        #
        # By setting 'centered' to T, it is possible to
        # calculate the running mean with the pivot point in the middle 
        # of the window. Otherwise the pivot is at the end of the window.
        #
        # By setting 'fillup.na' the initial row of NA at the beginning
        # of the result vector are replaced by the first valid average.
        #
        # Implementation: To avoid explicit looping over 
        # the whole dataset k lagged vectors (lagged by one) 
        # are summed and the sum then divided by k.
        #
        fillup.na <- fillup.na && centered
        n.orig <- length(data)
        if(centered) {
                # make sure that win.size is odd
                win.size <- win.size + (win.size + 1) %% 2
                # add (win.size %/% 2) NA to begin of data
                data <- c(rep(NA, win.size %/% 2), data)
                no.na <- win.size %/% 2
        }
        n <- length(data)
        z.sum <- 0
        for(i in 1:win.size)
                z.sum <- z.sum + c(data[i:n], rep(NA, i - 1))
        res <- z.sum/win.size
        if(fillup.na)
                res[1:no.na] <- res[no.na + 1]
        return(res[1:n.orig])
}

---
Gernot Schmidt
ASRC Albany



<Prev in Thread] Current Thread [Next in Thread>