Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Using apply to replace nested for loop

My goal is to go through various signals and ignore any 1's that are not part of a series (minimum of at least two 1's in a row). The data is an xts time series with 180K+ columns and 84 months. I've provided a small simplified data set I've used a nest for loop, but it's taking way too long to finish on the entire data set. It works but is horribly inefficient.

I know there's some way to use an apply function, but I can't figure it out.

Example data:

    mod_sig <- data.frame(a = c(0,1,0,0,0,1,1,0,0,0,1,0,1,1), 
                          b = c(0,0,1,0,0,1,0,0,0,1,1,1,1,1), 
                          c = c(0,1,0,1,0,1,1,1,0,0,0,1,1,0), 
                          d = c(0,1,1,1,0,1,1,0,0,1,1,1,1,1),
                          e = c(0,0,0,0,0,0,0,0,0,0,1,0,0,0))

    mod_sig <- xts(mod_sig, order.by = as.Date(seq(as.Date("2016-01-01"), as.Date("2017-02-01"), by = "month")))

Example code:

   # fixing months where condition is only met for one month
   # creating a new data frame for modified signals
   Signals_Fin <- data.frame(matrix(nrow = nrow(mod_sig), ncol = ncol(mod_sig)))
   colnames(Signals_Fin) <- colnames(mod_sig)

   # Loop over Signals to change 1's to 0's for one month events
   for(col in 1:ncol(mod_sig)) {
     for(row in 1:nrow(mod_sig)) {
       val <- ifelse(mod_sig[row,col] == 1, 
                     ifelse(mod_sig[row-1,col] == 0, 
                            ifelse(mod_sig[row+1,col] == 0,0,1),1),0)
       Signals_Fin[row, col] <- val
     }
   }

As you can see with the loop, any 1's that aren't in a sequence are changed to 0's. I know there is a better way, so I'm hoping to improve my approach. Any insights would be greatly appreciated. Thanks!

Answer from Zack and Ryan:

Zack and Ryan were spot on with dyplr, I only made slight modifications based off what was given and some colleague help.

Answer code:

    mod_sig <- data.frame(a = c(0,1,0,0,0,1,1,0,0,0,1,0,1,1), 
                      b = c(0,0,1,0,0,1,0,0,0,1,1,1,1,1), 
                      c = c(0,1,0,1,0,1,1,1,0,0,0,1,1,0), 
                      d = c(0,1,1,1,0,1,1,0,0,1,1,1,1,1),
                      e = c(0,0,0,0,0,0,0,0,0,0,1,0,0,0))

    Signals_fin = mod_sig %>% 
                  mutate_all(funs(ifelse((. == 1 & (lag(.) == 1 | lead(.) == 1)),1,0))) %>% 
                  mutate_all(funs(ifelse(is.na(.), 0, .)))


    Signals_fin <- xts(Signals_fin, order.by = as.Date(seq(as.Date("2016-01-01"), as.Date("2017-02-01"), by = "month")))
like image 837
Thee_Analyst Avatar asked Feb 20 '26 02:02

Thee_Analyst


1 Answers

here's a stab from a dplyr perspective, I converted your row_names to a column but you can just as easily convert them back to rownames with tibble::column_to_rownames():

library(dplyr)
library(tibble)

mod_sig %>%
  as.data.frame() %>%
  rownames_to_column('months') %>%
  mutate_at(vars(-months), function(x){
    if_else(x == 1 & 
              (lag(x, order_by = .$months) == 1 | 
                 lead(x, order_by = .$months) == 1),
            1,
            0)
  })

As suggested by @Ryan, his mutate_at call is more elegant, it's important everything is already sorted, though:

mod_sig %>%
  as.data.frame() %>%
  rownames_to_column('months') %>%
  mutate_at(vars(-months),  ~ as.numeric(.x & (lag(.x) | lead(.x))))

And to build on his suggestion:

mod_sig %>%
  as.data.frame() %>%
  mutate_all(~ as.numeric(.x & (lag(.x) | lead(.x))))
like image 132
zack Avatar answered Feb 22 '26 17:02

zack



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!