How to vectorize an operation on vectors of different sizes

Question

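I have vectors of different sizes and want to do element-wise manipulations: each element of vec1 scales a block of three consecutive elements of vec3, and the result goes into the matching block of vec2. How can I optimize the following for-loop in Python (for instance with np.vectorize())?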

import numpy as np

n = 1000000

vec1 = np.random.rand(n)
vec2 = np.random.rand(3*n)
vec3 = np.random.rand(3*n)

for i in range(len(vec1)):
    if vec1[i] < 0.5:
        vec2[3*i : 3*(i+1)] = vec1[i]*vec3[3*i : 3*(i+1)]
    else:
        vec2[3*i : 3*(i+1)] = [0,0,0]

Thanks a lot for your help.

Divakar · Accepted Answer

We could leverage broadcasting, viewing vec3 as n rows of 3 and vec1 as a column of shape (n, 1) -

v = vec3.reshape(-1,3)*vec1[:,None]
m = vec1<0.5
vec2_out = (v*m[:,None]).ravel()
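
To see what the broadcasting does, here is a small hand-checkable sketch. The tiny input values and the names rows, col and keep are made up for illustration; they are not part of the answer above.

```python
import numpy as np

# Tiny made-up inputs so the result can be checked by hand.
vec1 = np.array([0.25, 0.75])
vec3 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

rows = vec3.reshape(-1, 3)      # shape (2, 3): one row per element of vec1
col = vec1[:, None]             # shape (2, 1): stretched across each row
keep = (vec1 < 0.5)[:, None]    # shape (2, 1): True where the product is kept

vec2_out = (rows * col * keep).ravel()
print(vec2_out)  # first block scaled by 0.25, second block zeroed
```

The (2, 1) arrays are broadcast against the (2, 3) array, so each element of vec1 (and its mask bit) is applied to its own block of three without any explicit repetition.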

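Another way to express that, applying the mask to vec1 before broadcasting so the masking step touches n elements instead of 3n, would be -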

mask = vec1<0.5
vec2_out = (vec3.reshape(-1,3)*(vec1*mask)[:,None]).ravel()

And to make use of multiple cores, with the numexpr module -

import numexpr as ne

d = {'V3r':vec3.reshape(-1,3),'vec12D':vec1[:,None]}
out = ne.evaluate('V3r*vec12D*(vec12D<0.5)',d).ravel()
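
As a variation on the numexpr expression, the condition can also be written with numexpr's where() function instead of multiplying by the comparison. The following is only a sketch: the small n, the random inputs and the NumPy fallback branch are my own additions so that it runs even where numexpr is not installed.

```python
import numpy as np

try:
    import numexpr as ne  # optional third-party dependency
except ImportError:
    ne = None

rng = np.random.default_rng(0)
n = 1000
vec1 = rng.random(n)
vec3 = rng.random(3 * n)

V3r = vec3.reshape(-1, 3)
vec12D = vec1[:, None]

if ne is not None:
    # numexpr's where(cond, a, b) picks a where cond is true, else b.
    out = ne.evaluate('where(vec12D < 0.5, V3r * vec12D, 0.0)',
                      local_dict={'V3r': V3r, 'vec12D': vec12D}).ravel()
else:
    # Fallback so the sketch still runs without numexpr installed.
    out = np.where(vec12D < 0.5, V3r * vec12D, 0.0).ravel()

# Compare against the plain NumPy mask-multiplication form.
expected = (V3r * vec12D * (vec12D < 0.5)).ravel()
print(np.allclose(out, expected))
```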

Timings -

In [84]: n = 1000000
    ...: np.random.seed(0)
    ...: vec1 = np.random.rand(n)
    ...: vec2 = np.random.rand(3*n)
    ...: vec3 = np.random.rand(3*n)

In [86]: %%timeit
    ...: v = vec3.reshape(-1,3)*vec1[:,None]
    ...: m = vec1<0.5
    ...: vec2_out = (v*m[:,None]).ravel()
10 loops, best of 3: 23.2 ms per loop

In [87]: %%timeit
    ...: mask = vec1<0.5
    ...: vec2_out = (vec3.reshape(-1,3)*(vec1*mask)[:,None]).ravel()
100 loops, best of 3: 13.1 ms per loop

In [88]: %%timeit
    ...: d = {'V3r':vec3.reshape(-1,3),'vec12D':vec1[:,None]}
    ...: out = ne.evaluate('V3r*vec12D*(vec12D<0.5)',d).ravel()
100 loops, best of 3: 4.11 ms per loop
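
Outside IPython, a comparable measurement can be sketched with the standard timeit module. The function names two_step and one_step are hypothetical labels for the first two NumPy variants, and the numbers will differ by machine, so no expected output is shown.

```python
import timeit
import numpy as np

n = 1_000_000
rng = np.random.default_rng(0)
vec1 = rng.random(n)
vec3 = rng.random(3 * n)

def two_step():
    # Mask applied to the full (n, 3) product.
    v = vec3.reshape(-1, 3) * vec1[:, None]
    m = vec1 < 0.5
    return (v * m[:, None]).ravel()

def one_step():
    # Mask applied to the n-length vec1 before broadcasting.
    mask = vec1 < 0.5
    return (vec3.reshape(-1, 3) * (vec1 * mask)[:, None]).ravel()

for fn in (two_step, one_step):
    # Best of 3 repeats, 10 calls each, reported per call in milliseconds.
    best = min(timeit.repeat(fn, number=10, repeat=3)) / 10
    print(f"{fn.__name__}: {best * 1e3:.2f} ms per call")
```

A plausible reason the second variant is faster in the timings above is that it performs only one pass over a 3n-sized array, while the first performs two.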

For a generic case, where the else-part could be something other than zeros, it would be -

mask = vec1<0.5
IF_vals = vec3.reshape(-1,3)*vec1[:,None]
ELSE_vals = np.array([1,1,1])
out = np.where(mask[:,None],IF_vals,ELSE_vals).ravel()
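
np.where selects values rather than multiplying by the mask, which also matters when the data may contain inf or nan. A small sketch with made-up inputs (the non-finite values are placed in the excluded block on purpose):

```python
import numpy as np

vec1 = np.array([0.25, 0.75])
# Made-up data: the second block contains inf and nan on purpose.
vec3 = np.array([1.0, 2.0, 3.0, np.inf, np.nan, 6.0])

mask = vec1 < 0.5
with np.errstate(invalid='ignore'):  # silence the inf * 0 warning
    products = vec3.reshape(-1, 3) * vec1[:, None]
    by_multiply = (products * mask[:, None]).ravel()

by_where = np.where(mask[:, None], products, 0.0).ravel()

print(by_multiply)  # excluded block still carries nan (inf * 0 and nan * 0)
print(by_where)     # excluded block is exactly zero
```

So for inputs that are not guaranteed to be finite, the np.where form matches the original loop, while the mask-multiplication forms do not.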

ShadowRanger · Answer

numpy.vectorize, as mentioned in the comments, is for convenience, not performance, per the docs:

"The vectorize function is provided primarily for convenience, not for performance. The implementation is essentially a for loop."
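
For illustration only, this is roughly how the loop could be expressed with np.vectorize using its signature parameter. The helper name scale_block is hypothetical, and this still calls a Python function once per element of vec1, so it should not be expected to beat the plain loop by much.

```python
import numpy as np

def scale_block(a, block):
    # a is one element of vec1, block is the matching 3 elements of vec3.
    return a * block if a < 0.5 else np.zeros_like(block)

# signature='(),(k)->(k)': a scalar and a length-k vector in, a length-k vector out.
scale_blocks = np.vectorize(scale_block, signature='(),(k)->(k)')

rng = np.random.default_rng(0)
n = 1000
vec1 = rng.random(n)
vec3 = rng.random(3 * n)

looped = scale_blocks(vec1, vec3.reshape(-1, 3)).ravel()

# Same computation with real broadcasting, for comparison.
broadcast = (vec3.reshape(-1, 3) * (vec1 * (vec1 < 0.5))[:, None]).ravel()
print(np.allclose(looped, broadcast))
```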

One solution to actually vectorize this would be:

vec2[:] = vec1.repeat(3) * vec3   # Bulk compute all results
vec2[(vec1 >= 0.5).repeat(3)] = 0  # Zero the results you meant to exclude
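
A quick way to make sure the mask points the right way is to compare against the loop from the question on a small input. In this sketch, loop_reference is a hypothetical helper that restates that loop, and the blocks are zeroed where vec1 >= 0.5, as the loop's else-branch does.

```python
import numpy as np

def loop_reference(vec1, vec3):
    # Same logic as the loop in the question, written out for checking.
    out = np.empty_like(vec3)
    for i in range(len(vec1)):
        if vec1[i] < 0.5:
            out[3*i:3*(i+1)] = vec1[i] * vec3[3*i:3*(i+1)]
        else:
            out[3*i:3*(i+1)] = 0
    return out

rng = np.random.default_rng(0)
n = 1000
vec1 = rng.random(n)
vec3 = rng.random(3 * n)
vec2 = np.empty(3 * n)

vec2[:] = vec1.repeat(3) * vec3     # bulk compute all products
vec2[(vec1 >= 0.5).repeat(3)] = 0   # zero the blocks the else-branch zeroes

print(np.array_equal(vec2, loop_reference(vec1, vec3)))
```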

Another approach (that minimizes temporaries) would be to mask vec1 and reshape it into a column so it can be broadcast-assigned into vec2, then multiply vec2 by vec3 in place. This avoids any temporary beyond the two n-length arrays from the first step, e.g.:

vec2.reshape(-1, 3)[:] = (vec1 * (vec1 < 0.5)).reshape(-1, 1)
vec2 *= vec3

An additional temporary could be shaved if vec1 can be modified, simplifying to:

vec1 *= vec1 < 0.5
vec2.reshape(-1, 3)[:] = vec1.reshape(-1, 1)
vec2 *= vec3
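
The in-place variants rely on vec2.reshape(-1, 3) being a view of vec2, so that assigning into it writes through to vec2. A minimal sketch of that assumption, with a small made-up n; note that for a non-contiguous vec2 (for example a strided slice) reshape may return a copy, and the assignment would then not reach vec2.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
vec1 = rng.random(n)
vec3 = rng.random(3 * n)
vec2 = np.empty(3 * n)

view = vec2.reshape(-1, 3)
print(np.shares_memory(view, vec2))  # reshape of a contiguous array is a view

# Assigning an (n, 1) column into the (n, 3) view broadcasts each value
# across its row, writing straight into vec2's buffer.
view[:] = (vec1 * (vec1 < 0.5)).reshape(-1, 1)
vec2 *= vec3                         # in place, no extra 3n-length temporary

# Compare against the broadcasting form.
expected = (vec3.reshape(-1, 3) * (vec1 * (vec1 < 0.5))[:, None]).ravel()
print(np.allclose(vec2, expected))
```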

Tags: python, vectorization, numpy

Malte Winckler
