Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

lapply is very slow, is there a faster way?

Tags:

r

lapply

I have a rather large dataset organized in a list like so:

set.seed(0)
v <- rnorm(5000)
names(v) <- seq(1001, 6000, 1)
dates <- seq.Date(as.Date('2023-01-01'), by='day', length.out=365)
ls <- list()
ls <- sapply(dates, function(d) {ls[[length(ls) + 1]] <- v; ls})
names(ls) <- dates
str(ls[1:5])

List of 5
 $ 2023-01-01: Named num [1:5000] 1.263 -0.326 1.33 1.272 0.415 ...
  ..- attr(*, "names")= chr [1:5000] "1001" "1002" "1003" "1004" ...
 $ 2023-01-02: Named num [1:5000] 1.263 -0.326 1.33 1.272 0.415 ...
  ..- attr(*, "names")= chr [1:5000] "1001" "1002" "1003" "1004" ...
 $ 2023-01-03: Named num [1:5000] 1.263 -0.326 1.33 1.272 0.415 ...
  ..- attr(*, "names")= chr [1:5000] "1001" "1002" "1003" "1004" ...
 $ 2023-01-04: Named num [1:5000] 1.263 -0.326 1.33 1.272 0.415 ...
  ..- attr(*, "names")= chr [1:5000] "1001" "1002" "1003" "1004" ...
 $ 2023-01-05: Named num [1:5000] 1.263 -0.326 1.33 1.272 0.415 ...
  ..- attr(*, "names")= chr [1:5000] "1001" "1002" "1003" "1004" ... 

As you can see, this is 5000 data points for each day of the year totaling 1,825,000 data points. Let's call them "x". I want to perform the following operation on EACH data point, x: max(c(0.5 - x, 0)). The following code works but takes a really long time.

new <- sapply(names(ls), function(d)
          lapply(names(v), function(n) max(c(0.5 - ls[[d]][n], 0))))
rownames(new) <- names(v)
new[1:5, 1:5]

     2023-01-01 2023-01-02 2023-01-03 2023-01-04 2023-01-05
1001 0          0          0          0          0         
1002 0.8262334  0.8262334  0.8262334  0.8262334  0.8262334 
1003 0          0          0          0          0         
1004 0          0          0          0          0         
1005 0.08535857 0.08535857 0.08535857 0.08535857 0.08535857

Is there a faster way?

like image 212
WishIaskedThisSooner Avatar asked Oct 25 '25 05:10

WishIaskedThisSooner


2 Answers

Instead of applying lapply, use pmax

new <- sapply(ls, \(v) pmax(0.5 - v, 0))
like image 159
Hugh Avatar answered Oct 26 '25 20:10

Hugh


You can try:

m <- 0.5 - do.call(cbind, ls)
m[m < 0] <- 0

Which gives:

head(m, c(5,5))

     2023-01-01 2023-01-02 2023-01-03 2023-01-04 2023-01-05
1001 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
1002 0.82623336 0.82623336 0.82623336 0.82623336 0.82623336
1003 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
1004 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
1005 0.08535857 0.08535857 0.08535857 0.08535857 0.08535857

Benchmarks:

bench::mark(a = sapply(ls, \(v) pmax(0.5 - v, 0)),
            b = {
              m <- do.call(cbind, ls)
              pmax(0.5 - m, 0)
            },
            c = {
              m <- 0.5 - do.call(cbind, ls)
              m[m < 0] <- 0
              m
            })

# A tibble: 3 × 13
  expression      min   median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time result   memory             time       gc      
  <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm> <list>   <list>             <list>     <list>  
1 a           247.6ms    248ms      4.04     305MB     4.04     1     1      248ms <dbl[…]> <Rprofmem>         <bench_tm> <tibble>
2 b           113.1ms    114ms      8.77     229MB     0        5     0      570ms <dbl[…]> <Rprofmem [4 × 3]> <bench_tm> <tibble>
3 c            99.5ms    106ms      9.16     165MB     0        5     0      546ms <dbl[…]> <Rprofmem [5 × 3]> <bench_tm> <tibble>
like image 40
Ritchie Sacramento Avatar answered Oct 26 '25 19:10

Ritchie Sacramento



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!