Take the following simple function:
fun <- function(a, b, c, d, e) {
  stopifnot(
    "All inputs must be the same length." =
      length(a) == length(b) && length(b) == length(c) &&
      length(c) == length(d) && length(d) == length(e)
  )
  result <- (a + b / c + d) / sqrt(e)
  result2 <- a / result
  return(data.frame(result = result, result2 = result2, a = a, b = b, c = c, d = d, e = e))
}
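Note that the body of `fun()` is already vectorized: a single call with equal-length vectors computes every row at once. A quick sketch with made-up inputs (the length check means scalars must be recycled explicitly with `rep()`):

```r
# One vectorized call produces a three-row data frame; every
# argument must have the same length to pass the stopifnot() check.
res <- fun(a = 1:3, b = 4:6, c = rep(7, 3), d = rep(3, 3), e = rep(5, 3))
nrow(res)  # 3
```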
Now, if I want to map over a look-up table of all combinations of input values, I could do the following, e.g., using purrr
functionals:
library(purrr)
df <- expand.grid(a = 1:1000, b = c(1, 2, 3, 4, 5), c = 7, d = 3, e = 5)
out <- pmap_df(df, fun)
However, even for the relatively simple case of one long vector and one short one (which would be the most common case in my application), this is pretty slow.
Unit: seconds
min lq mean median uq max neval
2.235245 2.235245 2.235245 2.235245 2.235245 2.235245 1
How can I speed this up, especially for the simple case sketched above? Of course, as df
grows larger, things will only get slower.
I cannot say my solution is the fastest, but it is certainly faster. You can try the code below
do.call(fun, df)
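The speed-up comes from replacing thousands of row-wise calls with a single call on whole columns: `do.call(fun, df)` expands to `fun(a = df$a, b = df$b, ...)`, and the vectorized arithmetic inside `fun()` handles all rows at once, whereas `pmap_df()` calls `fun()` once per row and binds thousands of one-row data frames. A minimal sketch of the mechanism, using a hypothetical toy function `g()` standing in for `fun()`:

```r
library(purrr)

g <- function(a, b) data.frame(s = a + b)  # vectorized, like fun()
df2 <- expand.grid(a = 1:3, b = c(10, 20))

rowwise <- pmap_df(df2, g)   # nrow(df2) calls, one row each, then bound
colwise <- do.call(g, df2)   # one call on whole columns: g(a = df2$a, b = df2$b)

all.equal(rowwise, colwise)  # TRUE
```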
and the benchmarking (this also needs the microbenchmark package)

library(microbenchmark)
df <- expand.grid(a = 1:1000, b = c(1, 2, 3, 4, 5), c = 7, d = 3, e = 5)
f_Rob <- function() pmap_df(df, function(a, b, c, d, e) fun(a = a, b = b, c = c, d = d, e = e))
f_TIC <- function() do.call(fun, df)
microbenchmark(
  f_Rob(),
  f_TIC(),
  unit = "relative",
  check = "equivalent",
  times = 10
)
and you will see
Unit: relative
expr min lq mean median uq max neval
f_Rob() 1074.886 1049.034 441.6319 854.2739 620.4029 92.29739 10
f_TIC() 1.000 1.000 1.0000 1.0000 1.0000 1.00000 10