I have vectors of different sizes and want to do element-wise manipulations on them. How can I optimize the following for-loop in Python (for instance, with np.vectorize())?
import numpy as np
n = 1000000
vec1 = np.random.rand(n)
vec2 = np.random.rand(3*n)
vec3 = np.random.rand(3*n)
for i in range(len(vec1)):
    if vec1[i] < 0.5:
        vec2[3*i : 3*(i+1)] = vec1[i]*vec3[3*i : 3*(i+1)]
    else:
        vec2[3*i : 3*(i+1)] = [0,0,0]
Thanks a lot for your help.
We could leverage broadcasting -
v = vec3.reshape(-1,3)*vec1[:,None]
m = vec1<0.5
vec2_out = (v*m[:,None]).ravel()
Another way to express that would be -
mask = vec1<0.5
vec2_out = (vec3.reshape(-1,3)*(vec1*mask)[:,None]).ravel()
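As a quick sanity check, the broadcasting version can be compared against the loop from the question on a small input. This is only a sketch: the function names loop_reference and broadcast_version, the seed, and the reduced size n = 1000 are assumptions chosen for illustration.

```python
import numpy as np

def loop_reference(vec1, vec3):
    """Loop from the question, writing into a fresh output array."""
    out = np.empty_like(vec3)
    for i in range(len(vec1)):
        if vec1[i] < 0.5:
            out[3*i : 3*(i+1)] = vec1[i] * vec3[3*i : 3*(i+1)]
        else:
            out[3*i : 3*(i+1)] = 0
    return out

def broadcast_version(vec1, vec3):
    """Masked scale factors broadcast across each group of three elements."""
    mask = vec1 < 0.5
    # (n, 3) * (n, 1): each row of vec3 is scaled by vec1[i], or by 0 if masked out
    return (vec3.reshape(-1, 3) * (vec1 * mask)[:, None]).ravel()

rng = np.random.default_rng(0)
n = 1000  # small size so the pure-Python loop finishes quickly
vec1 = rng.random(n)
vec3 = rng.random(3 * n)

assert np.allclose(loop_reference(vec1, vec3), broadcast_version(vec1, vec3))
```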
We could also make use of multiple cores with the numexpr module -
import numexpr as ne
d = {'V3r':vec3.reshape(-1,3),'vec12D':vec1[:,None]}
out = ne.evaluate('V3r*vec12D*(vec12D<0.5)',d).ravel()
Timings -
In [84]: n = 1000000
...: np.random.seed(0)
...: vec1 = np.random.rand(n)
...: vec2 = np.random.rand(3*n)
...: vec3 = np.random.rand(3*n)
In [86]: %%timeit
...: v = vec3.reshape(-1,3)*vec1[:,None]
...: m = vec1<0.5
...: vec2_out = (v*m[:,None]).ravel()
10 loops, best of 3: 23.2 ms per loop
In [87]: %%timeit
...: mask = vec1<0.5
...: vec2_out = (vec3.reshape(-1,3)*(vec1*mask)[:,None]).ravel()
100 loops, best of 3: 13.1 ms per loop
In [88]: %%timeit
...: d = {'V3r':vec3.reshape(-1,3),'vec12D':vec1[:,None]}
...: out = ne.evaluate('V3r*vec12D*(vec12D<0.5)',d).ravel()
100 loops, best of 3: 4.11 ms per loop
For a generic case, where the else-part could be something other than zeros, it would be -
mask = vec1<0.5
IF_vals = vec3.reshape(-1,3)*vec1[:,None]
ELSE_vals = np.array([1,1,1])
out = np.where(mask[:,None],IF_vals,ELSE_vals).ravel()
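To make the generic np.where form concrete, here is a small sketch with hand-picked inputs. The function name where_version and the sample values are assumptions for illustration; else_vals can be any array that broadcasts against the (n, 3) shape.

```python
import numpy as np

def where_version(vec1, vec3, else_vals):
    """Pick vec1[i] * vec3 rows where vec1[i] < 0.5, else_vals otherwise."""
    mask = vec1 < 0.5
    if_vals = vec3.reshape(-1, 3) * vec1[:, None]
    # mask[:, None] has shape (n, 1) and broadcasts across the three columns
    return np.where(mask[:, None], if_vals, else_vals).ravel()

vec1 = np.array([0.25, 0.75])
vec3 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
out = where_version(vec1, vec3, np.array([1.0, 1.0, 1.0]))
print(out)  # first triple is scaled by 0.25, second triple is replaced by ones
```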
numpy.vectorize, as mentioned in the comments, is for convenience, not performance, per the docs:
The vectorize function is provided primarily for convenience, not for performance. The implementation is essentially a for loop.
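A small sketch of what that means in practice: np.vectorize wraps a scalar Python function so it accepts arrays, but the function is still invoked at the Python level for each element. The names scale_if_small and counter are hypothetical, and passing otypes is an assumption made here so that no extra call is needed to infer the output type.

```python
import numpy as np

counter = {"calls": 0}  # tracks how often the scalar function is invoked

def scale_if_small(x, y):
    """Scalar version of the loop body from the question."""
    counter["calls"] += 1
    return x * y if x < 0.5 else 0.0

vfunc = np.vectorize(scale_if_small, otypes=[float])

vec1 = np.array([0.2, 0.8, 0.4])
vec3 = np.arange(9, dtype=float)

# Repeat each vec1 entry three times so it lines up with vec3
out = vfunc(vec1.repeat(3), vec3)
print(out)
print(counter["calls"])  # the Python function runs per element, so no speed-up over a loop
```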
One solution to actually vectorize this would be:
vec2[:] = vec1.repeat(3) * vec3 # Bulk compute all results
vec2[(vec1 >= 0.5).repeat(3)] = 0 # Zero the results you meant to exclude (the else branch)
Another approach (that minimizes temporaries) would be to mask vec1 and broadcast it into vec2, then multiply vec2 by vec3 in place, which avoids any temporary beyond the two length-n arrays from the first step, e.g.:
vec2.reshape(-1, 3)[:] = (vec1 * (vec1 < 0.5)).reshape(-1, 1)
vec2 *= vec3
An additional temporary could be shaved if vec1 can be modified, simplifying to:
vec1 *= vec1 < 0.5
vec2.reshape(-1, 3)[:] = vec1.reshape(-1, 1)
vec2 *= vec3
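The in-place idea can be checked against the condition from the question (keep vec1[i]*vec3 where vec1[i] < 0.5, zero otherwise). This is a minimal sketch: the function name inplace_version, the seed, and the small size are assumptions, and it relies on vec2 being contiguous so that reshape returns a view rather than a copy.

```python
import numpy as np

def inplace_version(vec1, vec2, vec3):
    """Fill vec2 in place: vec1[i] * vec3 where vec1[i] < 0.5, zero elsewhere."""
    # Broadcast the masked scale factors into vec2 via a (n, 3) view
    vec2.reshape(-1, 3)[:] = (vec1 * (vec1 < 0.5)).reshape(-1, 1)
    # Multiply by vec3 without allocating another 3*n temporary
    vec2 *= vec3
    return vec2

rng = np.random.default_rng(0)
n = 1000
vec1 = rng.random(n)
vec3 = rng.random(3 * n)
vec2 = np.empty(3 * n)

# Reference result built with np.where for comparison
expected = np.where((vec1 < 0.5).repeat(3), vec1.repeat(3) * vec3, 0.0)

inplace_version(vec1, vec2, vec3)
assert np.allclose(vec2, expected)
```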