dplyr has the vectorized conditionals if_else and case_when. However, both of these eagerly evaluate their possible outputs (true/false for if_else, the RHS of the formula for case_when):
suppressPackageStartupMessages({
library(dplyr)
})
if_else(c(T, T, T), print(1), print(2))
#> [1] 1
#> [1] 2
#> [1] 1 1 1
case_when(
c(T, T, T) ~ print(1),
c(F, F, F) ~ print(2)
)
#> [1] 1
#> [1] 2
#> [1] 1 1 1
Created on 2020-02-05 by the reprex package (v0.3.0)
Here we can clearly see that the false cases are evaluated even though they're never used. I'm looking for a way to avoid this.
Is there an alternative which doesn't do this?
I'm aware that one alternative is base::ifelse:
ifelse(c(T, T, T), print(1), print(2))
#> [1] 1
#> [1] 1 1 1
However, base::ifelse is notoriously inefficient, so a better alternative would be nice. That being said, I'm especially interested in alternatives for case_when, which I use quite a bit when I'd otherwise need to use a chain of ifelse calls.
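To make the base::ifelse behaviour concrete, here is a minimal sketch in base R only. It relies on the fact that R function arguments are promises that are evaluated only when the function body first uses them, and that ifelse touches `yes` or `no` only when at least one test element needs it. The helper `track` and the variable `forced` are hypothetical names introduced for illustration.

```r
# Hypothetical helper: records a label each time a branch is actually evaluated.
forced <- character(0)
track <- function(label, value) {
  forced <<- c(forced, label)
  value
}

# All tests TRUE: only the `yes` promise is forced.
res_all_true <- ifelse(c(TRUE, TRUE, TRUE), track("yes", 1), track("no", 2))
forced_all_true <- forced

# Mixed tests: both promises are forced, `yes` first.
forced <- character(0)
res_mixed <- ifelse(c(TRUE, FALSE, TRUE), track("yes", 1), track("no", 2))
forced_mixed <- forced
```

Note that each branch is still evaluated as a whole vector once it is needed; the laziness is per branch, not per element.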
I've already looked at data.table::fifelse, but it suffers from the same problem:
suppressPackageStartupMessages({
library(data.table)
})
fifelse(c(T, T, T), print(1), print(2))
#> [1] 1
#> [1] 2
#> [1] 1 1 1
So, is there an alternative for if_else and case_when which doesn't eagerly evaluate its unused cases?
If you install the development version of data.table from GitHub you can use fcase, which is similar to dplyr::case_when but with lazy evaluation:
data.table::fcase(c(TRUE, TRUE, TRUE), print(1L), c(FALSE, FALSE, FALSE), print(2L))
[1] 1
[1] 1 1 1
You could just rely on R's native lazy evaluation of function arguments and use all to screen for cases where no FALSE is present:
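lazy_if_else <- function(logical_test, value_if_true, value_if_false)
{
  # All TRUE (and no NA): value_if_false is never forced, so it is never evaluated
  if(isTRUE(all(logical_test))) return(rep(value_if_true, length.out = length(logical_test)))
  # Otherwise fall back to dplyr::if_else, which evaluates both branches
  if_else(logical_test, value_if_true, value_if_false)
}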
In the all-TRUE case, this out-performs both ifelse and if_else:
microbenchmark::microbenchmark(ifelse(c(T, T, T), 0, Sys.sleep(0.1)),
if_else(c(T, T, T), 0, Sys.sleep(0.1)),
lazy_if_else(c(T, T, T), 0, Sys.sleep(0.1)))
#> Unit: microseconds
#> expr min lq mean
#> ifelse(c(T, T, T), 0, Sys.sleep(0.1)) 12.662 13.689 25.47675
#> if_else(c(T, T, T), 0, Sys.sleep(0.1)) 102723.054 109145.897 109678.33523
#> lazy_if_else(c(T, T, T), 0, Sys.sleep(0.1)) 4.791 5.476 10.80378
#> median uq max neval cld
#> 15.3995 34.904 74.255 100 a
#> 110036.0945 110176.049 116619.936 100 b
#> 6.5030 16.768 26.008 100 a
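The same screening idea can be extended to case_when-style formulas. The following is only a rough base R sketch, not dplyr's or data.table's implementation: `lazy_case_when` is a hypothetical name, it evaluates every left-hand side up front (those are assumed to be cheap), and it evaluates a right-hand side only if its condition would actually fill at least one still-empty position. It does no type checking, and it assumes each condition and value has length 1 or the common length.

```r
# Hypothetical helper: case_when-like, but a RHS is evaluated only when needed.
lazy_case_when <- function(...) {
  formulas <- list(...)
  # Evaluate all conditions (LHS) in the environment each formula was created in.
  conds <- lapply(formulas, function(f) eval(f[[2]], environment(f)))
  n <- max(lengths(conds), 0L)
  done <- rep(FALSE, n)  # positions that already received a value
  out <- NULL            # created once the first value is known
  for (i in seq_along(formulas)) {
    cond <- rep(conds[[i]], length.out = n)
    hit <- !done & !is.na(cond) & cond
    if (!any(hit)) next  # nothing to fill: the RHS is never evaluated
    f <- formulas[[i]]
    value <- rep(eval(f[[3]], environment(f)), length.out = n)
    if (is.null(out)) out <- rep(value[NA_integer_], n)  # typed NA vector
    out[hit] <- value[hit]
    done <- done | hit
  }
  if (is.null(out)) rep(NA, n) else out
}

x <- c(1, 5, 10)

# The second RHS would raise an error if it were evaluated.
labels <- lazy_case_when(
  x < 20 ~ "small",
  x > 100 ~ stop("never evaluated")
)

# Earlier cases win, as in case_when; TRUE acts as the fallback.
bands <- lazy_case_when(
  x < 3 ~ "low",
  x < 8 ~ "mid",
  TRUE ~ "high"
)
```

Positions matched by no condition stay NA, mirroring the default behaviour of case_when.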