Calculate extra values for moving average (and other functions)

Question

Suppose I have the following dataset:

library(dplyr)
library(zoo)

df <- data.frame(date = seq.Date(from = "2025-01-01", to = "2025-01-10"),
                 value = 1:10)

df
#>          date value
#> 1  2025-01-01     1
#> 2  2025-01-02     2
#> 3  2025-01-03     3
#> 4  2025-01-04     4
#> 5  2025-01-05     5
#> 6  2025-01-06     6
#> 7  2025-01-07     7
#> 8  2025-01-08     8
#> 9  2025-01-09     9
#> 10 2025-01-10    10

When I calculate the simple moving average for, let's say, the last 5 observations, this is what I get:

df |> 
  mutate(value_roll = rollapply(value, width = 5, FUN = mean, fill = NA, align = "right"))
#>          date value value_roll
#> 1  2025-01-01     1         NA
#> 2  2025-01-02     2         NA
#> 3  2025-01-03     3         NA
#> 4  2025-01-04     4         NA
#> 5  2025-01-05     5          3
#> 6  2025-01-06     6          4
#> 7  2025-01-07     7          5
#> 8  2025-01-08     8          6
#> 9  2025-01-09     9          7
#> 10 2025-01-10    10          8

As expected, the first 4 values are NA. However, for a simple moving average of order k, I'd like the first k-1 values to be the simple moving averages of order j, for j = 1, ..., k-1, i.e., the mean of all observations available up to that row. For example,

#>          date value   value_roll
#> 1  2025-01-01     1          1
#> 2  2025-01-02     2          1.5
#> 3  2025-01-03     3          2
#> 4  2025-01-04     4          2.5
#> 5  2025-01-05     5          3
#> 6  2025-01-06     6          4
#> 7  2025-01-07     7          5
#> 8  2025-01-08     8          6
#> 9  2025-01-09     9          7
#> 10 2025-01-10    10          8

Also, I want to be able to pass other functions to the FUN argument of rollapply, such as sum (i.e., the sum of the last k values) and sd (i.e., the standard deviation of the last k values), with the leading values computed in the same fashion as for the moving average.

Is there a simple way to do this in R? I bet there is, but I couldn't come up with one. Everything I tried was too complex for my taste.

Friede · Accepted Answer

From help(zoo::rollapply):

partial
logical or numeric. If FALSE (default) then FUN is only applied when all indexes of the rolling window are within the observed time range. If TRUE, then the subset of indexes that are in range are passed to FUN. A numeric argument to partial can be used to determine the minimal window size for partial computations. See below for more details.

e.g.

data.frame(date = seq.Date(from='2025-01-01', to='2025-01-10'), value = 1:10) |>
  transform(value_roll = zoo::rollapply(
    value, width=5, FUN=mean, fill=NA, align='right', partial=TRUE))

gives

         date value value_roll
1  2025-01-01     1        1.0
2  2025-01-02     2        1.5
3  2025-01-03     3        2.0
4  2025-01-04     4        2.5
5  2025-01-05     5        3.0
6  2025-01-06     6        4.0
7  2025-01-07     7        5.0
8  2025-01-08     8        6.0
9  2025-01-09     9        7.0
10 2025-01-10    10        8.0
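
The question also asks about sum and sd. As a minimal sketch of how the same partial = TRUE approach might extend to them: roll_partial below is a hypothetical helper name (not part of zoo), and rollapplyr() is zoo's right-aligned wrapper around rollapply(). Because partial = TRUE computes a value for every row, fill is left out here.

```r
library(zoo)

df <- data.frame(
  date  = seq(as.Date("2025-01-01"), as.Date("2025-01-10"), by = "day"),
  value = 1:10
)

# Hypothetical helper: right-aligned rolling window of width k that
# shrinks to the available observations at the start of the series.
roll_partial <- function(x, k, f) {
  rollapplyr(x, width = k, FUN = f, partial = TRUE)
}

res <- transform(
  df,
  mean_roll = roll_partial(value, 5, mean),
  sum_roll  = roll_partial(value, 5, sum),
  sd_roll   = roll_partial(value, 5, sd)
)
res
```

The first entry of sd_roll is NA, since the standard deviation of a single observation is undefined.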

Note

Edited to add session info; see the comments below this answer.

> sessionInfo()
R version 4.5.0 (2025-04-11)
Platform: aarch64-apple-darwin20
Running under: macOS Sequoia 15.2

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.1

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Europe/Berlin
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] zoo_1.8-14     compiler_4.5.0 tools_4.5.0    grid_4.5.0     lattice_0.22-6

margusl · Answer

You can build a "sub-frame" of all your windowed calculations with slider::slide_dfr(); when mutate() receives it as an unnamed argument, it unpacks the data frame into separate columns:

tibble::tibble(date = seq.Date(from = "2025-01-01", to = "2025-01-10"), value = 1:10) |>
  dplyr::mutate(
    slider::slide_dfr(
      value, 
      .f = \(x) data.frame(mean_roll = mean(x), sum_roll = sum(x), sd_roll = sd(x)), 
      .before = 4
    )
  )
#> # A tibble: 10 × 5
#>    date       value mean_roll sum_roll sd_roll
#>    <date>     <int>     <dbl>    <int>   <dbl>
#>  1 2025-01-01     1       1          1  NA    
#>  2 2025-01-02     2       1.5        3   0.707
#>  3 2025-01-03     3       2          6   1    
#>  4 2025-01-04     4       2.5       10   1.29 
#>  5 2025-01-05     5       3         15   1.58 
#>  6 2025-01-06     6       4         20   1.58 
#>  7 2025-01-07     7       5         25   1.58 
#>  8 2025-01-08     8       6         30   1.58 
#>  9 2025-01-09     9       7         35   1.58 
#> 10 2025-01-10    10       8         40   1.58
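
The leading rows are filled because slider evaluates incomplete windows by default (.complete = FALSE). A small sketch contrasting the two settings, using slide_dbl() on a bare vector for brevity (the variable names are illustrative only):

```r
library(slider)

x <- 1:10

# Default (.complete = FALSE): leading windows shrink to the values available
partial_mean <- slide_dbl(x, mean, .before = 4)

# .complete = TRUE: windows with fewer than 5 values yield NA instead
complete_mean <- slide_dbl(x, mean, .before = 4, .complete = TRUE)

data.frame(x, partial_mean, complete_mean)
```

With .complete = TRUE the result mirrors the NA-padded output from the question, so the same call can produce either behaviour.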

Tags: r, dplyr, moving-average

Marcus Nunes
