I have written a program that works with the 3n + 1 problem (aka "wondrous numbers" and various other things). But it has a double loop. How could I vectorize it?
the code is
count <- vector("numeric", 100000)
L <- length(count)
for (i in 1:L)
{
x <- i
while (x > 1)
{
if (round(x/2) == x/2)
{
x <- x/2
count[i] <- count[i] + 1
} else
{
x <- 3*x + 1
count[i] <- count[i] + 1
}
}
}
Thanks!
I turned this 'inside-out' by creating a vector x where the ith element is the value after each iteration of the algorithm. The result is relatively intelligible as
f1 <- function(L) {
x <- seq_len(L)
count <- integer(L)
while (any(i <- x > 1)) {
count[i] <- count[i] + 1L
x <- ifelse(round(x/2) == x/2, x / 2, 3 * x + 1) * i
}
count
}
This can be optimized to (a) track only those values still in play (via idx) and (b) avoid unnecessary operations, e.g., ifelse evaluates both arguments for all values of x, x/2 evaluated twice.
f2 <- function(L) {
idx <- x <- seq_len(L)
count <- integer(L)
while (length(x)) {
ix <- x > 1
x <- x[ix]
idx <- idx[ix]
count[idx] <- count[idx] + 1L
i <- as.logical(x %% 2)
x[i] <- 3 * x[i] + 1
i <- !i
x[i] <- x[i] / 2
}
count
}
with f0 the original function, I have
> L <- 10000
> system.time(ans0 <- f0(L))
user system elapsed
7.785 0.000 7.812
> system.time(ans1 <- f1(L))
user system elapsed
1.738 0.000 1.741
> identical(ans0, ans1)
[1] TRUE
> system.time(ans2 <- f2(L))
user system elapsed
0.301 0.000 0.301
> identical(ans1, ans2)
[1] TRUE
A tweak is to update odd values to 3 * x[i] + 1 and then do the division by two unconditionally
x[i] <- 3 * x[i] + 1
count[idx[i]] <- count[idx[i]] + 1L
x <- x / 2
count[idx] <- count[idx] + 1
With this as f3 (not sure why f2 is slower this morning!) I get
> system.time(ans2 <- f2(L))
user system elapsed
0.36 0.00 0.36
> system.time(ans3 <- f3(L))
user system elapsed
0.201 0.003 0.206
> identical(ans2, ans3)
[1] TRUE
It seems like larger steps can be taken at the divide-by-two stage, e.g., 8 is 2^3 so we could take 3 steps (add 3 to count) and be finished, 20 is 2^2 * 5 so we could take two steps and enter the next iteration at 5. Implementations?
Because you need to iterate on values of x you can't really vectorize this. At some point, R has to work on each value of x separately and in turn. You might be able to run the computations on separate CPU cores to speed things up, perhaps using foreach in the package of the same name.
Otherwise, (and this is just hiding the loop from you), wrap the main body of your loop as a function, e.g.:
wonderous <- function(n) {
count <- 0
while(n > 1) {
if(isTRUE(all.equal(n %% 2, 0))) {
n <- n / 2
} else {
n <- (3*n) + 1
}
count <- count + 1
}
return(count)
}
and then you can use sapply() to run the function on a set of numbers:
> sapply(1:50, wonderous)
[1] 0 1 7 2 5 8 16 3 19 6 14 9 9 17 17
[16] 4 12 20 20 7 7 15 15 10 23 10 111 18 18 18
[31] 106 5 26 13 13 21 21 21 34 8 109 8 29 16 16
[46] 16 104 11 24 24
Or you can use Vectorize to return a vectorized version of wonderous which is itself a function that hides even more of this from you:
> wonderousV <- Vectorize(wonderous)
> wonderousV(1:50)
[1] 0 1 7 2 5 8 16 3 19 6 14 9 9 17 17
[16] 4 12 20 20 7 7 15 15 10 23 10 111 18 18 18
[31] 106 5 26 13 13 21 21 21 34 8 109 8 29 16 16
[46] 16 104 11 24 24
I think that is about as far as you can get with standard R tools at the moment.@Martin Morgan shows you can do a lot better than this with an ingenious take on solving the problem that does used R's vectorised abilities.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With