
Rolling Median with subset building over time

Tags: r, median

I would like to compute a variant of the rolling median on my dataset that builds each subset not by going k observations forward and back, but by taking all observations that fall within a given time window.

A straightforward implementation could look like this:

windowwidth <- 30
# median of all values whose time lies within windowwidth/2 of x
median.window <- function(x) median(mydata[time <= x + windowwidth / 2 & time >= x - windowwidth / 2])
vapply(time, median.window, numeric(1))

However, as you can imagine, this is not very efficient for large datasets. Do you see a possible improvement, or know of a package that provides an optimized implementation? You cannot assume that the observations are evenly distributed over time.

zoo provides rollmedian, but that function chooses the window by observation count, not by time.

Thilo asked Nov 27 '25 15:11


1 Answer

Ok, try this:

Rgames: timeseq<-1:5 
Rgames: winmat <- outer(timeseq,timeseq,FUN=function(x,y) y>=x &y<=x+2) 
Rgames: winmat 
      [,1]  [,2]  [,3]  [,4]  [,5] 
[1,]  TRUE  TRUE  TRUE FALSE FALSE 
[2,] FALSE  TRUE  TRUE  TRUE FALSE 
[3,] FALSE FALSE  TRUE  TRUE  TRUE 
[4,] FALSE FALSE FALSE  TRUE  TRUE 
[5,] FALSE FALSE FALSE FALSE  TRUE 
Rgames: winmat %*% timeseq 
     [,1] 
[1,]    6 
[2,]    9 
[3,]   12 
[4,]    9 
[5,]    5 

The matrix product above gives the sum over each window. Replace the function passed to outer with your own window condition and I think you'll be all set.
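
For the window in the question, which is centred on each observation and defined in time units, the window function might be adapted as in the sketch below. The sample data and the names obs_time, obs_value and half_width are made up for illustration; only base R is assumed.

```r
# Hypothetical, irregularly spaced observation times and their values
obs_time   <- c(1, 2, 10, 11, 12, 40)
obs_value  <- c(5, 7, 1, 9, 3, 4)
half_width <- 5   # plays the role of windowwidth / 2 in the question

# in_window[i, j] is TRUE when observation j lies within half_width of observation i
in_window <- outer(obs_time, obs_time, function(x, y) abs(y - x) <= half_width)

# Median of the values selected by each row of the filter matrix
roll_med <- apply(in_window, 1, function(keep) median(obs_value[keep]))
roll_med
```

With these made-up values, roll_med comes out as 6 6 3 3 3 4: the first two observations share one window, the next three share another, and the last one stands alone.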
Edit: In response to Thilo's query: in the general case you should use apply. Building on the above, call your observation values "timval":

Rgames: timval<-c(3,4,2,6,1)
Rgames: valmat<-timval*t(winmat)
Rgames: valmat
     [,1] [,2] [,3] [,4] [,5]
[1,]    3    0    0    0    0
[2,]    4    4    0    0    0
[3,]    2    2    2    0    0
[4,]    0    6    6    6    0
[5,]    0    0    1    1    1
Rgames: apply(valmat,2,median)
[1] 2 2 1 0 0

Edit again: clearly I was asleep there: nobody wants a median based on all those zeroes. I should think more before posting. Add this:

valmat[valmat==0]<- NA
apply(valmat,2, median, na.rm=TRUE)
[1] 3.0 4.0 2.0 3.5 1.0
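
One caveat with the zero test: valmat[valmat==0] <- NA would also throw away observations whose value really is 0. A possible way around that, sketched here with the same toy data, is to blank out entries using the window matrix itself rather than the values. The name window_medians is introduced only for this example.

```r
timeseq <- 1:5
timval  <- c(3, 4, 2, 6, 1)
winmat  <- outer(timeseq, timeseq, FUN = function(x, y) y >= x & y <= x + 2)

# Every column starts as a full copy of the observation values
valmat <- matrix(timval, nrow = length(timval), ncol = length(timval))

# Column i keeps only the observations inside window i; the rest become NA
valmat[!t(winmat)] <- NA

window_medians <- apply(valmat, 2, median, na.rm = TRUE)
window_medians
```

This gives the same 3.0 4.0 2.0 3.5 1.0 as above, and a genuine zero in timval would still count towards its window's median.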

And I'm sure there's a cleaner way of 'building' valmat than this, but the final result is the "filter matrix" you want to apply any function to.
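
Because the filter matrix has n x n entries, it may become too large for long series. A possible alternative, which is not the approach above, is to sort by time and look up the first and last index of each window with findInterval. The helper name rolling_median_by_time and its arguments are hypothetical. The sketch assumes a centred window, no missing times or values, and an R version whose findInterval supports left.open (3.3.0 or later, as far as I know).

```r
# Hypothetical helper: median of all values within width/2 of each observation time
rolling_median_by_time <- function(obs_time, obs_value, width) {
  ord      <- order(obs_time)
  t_sorted <- obs_time[ord]
  v_sorted <- obs_value[ord]
  half     <- width / 2

  # lo[i]: first sorted index with time >= t_sorted[i] - half
  lo <- findInterval(t_sorted - half, t_sorted, left.open = TRUE) + 1L
  # hi[i]: last sorted index with time <= t_sorted[i] + half
  hi <- findInterval(t_sorted + half, t_sorted)

  med <- vapply(seq_along(t_sorted),
                function(i) median(v_sorted[lo[i]:hi[i]]),
                numeric(1))

  # Return the medians in the original order of the input
  out <- numeric(length(med))
  out[ord] <- med
  out
}

example_medians <- rolling_median_by_time(c(10, 1, 2), c(1, 5, 7), width = 10)
example_medians
```

This still computes one median per window, so it is not a true streaming median, but it only ever holds one window's values at a time instead of the full matrix.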

Carl Witthoft answered Nov 29 '25 11:11