I need to create a 2D numpy array from a list of 1D arrays and scalars so that the scalars are replicated to match the length of the 1D arrays.
Example of the desired behaviour:
>>> x = np.ones(5)
>>> something([x, 0, x])
array([[ 1., 1., 1., 1., 1.],
       [ 0., 0., 0., 0., 0.],
       [ 1., 1., 1., 1., 1.]])
I know that the vectorial elements of the list are always going to have the same length (shape) so I can do it "by hand" by doing something like this:
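import numpy as np

def something(lst):
    # Find the common length from the first array in the list.
    for e in lst:
        if isinstance(e, np.ndarray):
            l = len(e)
            break
    tmp = []
    for e in lst:
        if isinstance(e, np.ndarray):
            tmp.append(e)
        else:
            # Replicate the scalar to the common length.
            tmp.append(np.empty(l))
            tmp[-1][:] = e
    return np.array(tmp)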
What I am asking for is whether there is some ready-made solution hidden somewhere in numpy or, if there is none, whether there is a better (e.g. more general, more reliable, faster) solution than the one above.
In [179]: np.column_stack(np.broadcast(x, 0, x))
Out[179]:
array([[ 1., 1., 1., 1., 1.],
       [ 0., 0., 0., 0., 0.],
       [ 1., 1., 1., 1., 1.]])
or
In [187]: np.row_stack(np.broadcast_arrays(x, 0, x))
Out[187]:
array([[ 1., 1., 1., 1., 1.],
       [ 0., 0., 0., 0., 0.],
       [ 1., 1., 1., 1., 1.]])
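On recent NumPy releases the two calls above may need small adjustments. As far as I know, np.row_stack is a deprecated alias of np.vstack since NumPy 2.0, and the stacking functions expect a real sequence (list or tuple) rather than an iterator such as np.broadcast. A minimal sketch under those assumptions (the variable names are only illustrative):

```python
import numpy as np

x = np.ones(5)

# np.vstack does the same job as np.row_stack.
via_arrays = np.vstack(np.broadcast_arrays(x, 0, x))

# Materialise the np.broadcast iterator as a list of tuples first;
# each tuple then becomes one column of the result.
via_iter = np.column_stack(list(np.broadcast(x, 0, x)))
```

Both results should have shape (3, 5), with the scalar replicated across the middle row.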
Using np.broadcast is faster than np.broadcast_arrays:
In [195]: %timeit np.column_stack(np.broadcast(*[x, 0, x]*10))
10000 loops, best of 3: 46.4 µs per loop
In [196]: %timeit np.row_stack(np.broadcast_arrays(*[x, 0, x]*10))
1000 loops, best of 3: 380 µs per loop
but slower than your something function:
In [201]: %timeit something([x, 0, x]*10)
10000 loops, best of 3: 37.3 µs per loop
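Since the hand-written loop is competitive, another option is to keep that approach but preallocate the output and let assignment do the broadcasting. This is only a sketch and has not been timed here; stack_rows is a hypothetical name, the dtype default is an assumption, and np.broadcast_shapes needs NumPy 1.20 or later:

```python
import numpy as np

def stack_rows(items, dtype=float):
    """Stack arrays and scalars into one array, replicating scalars."""
    arrays = [np.asarray(e) for e in items]
    # Common shape that every item can be broadcast to.
    shape = np.broadcast_shapes(*(a.shape for a in arrays))
    out = np.empty((len(arrays),) + shape, dtype=dtype)
    for i, a in enumerate(arrays):
        # Assignment broadcasts a scalar (or smaller array) across the row.
        out[i] = a
    return out

x = np.ones(5)
result = stack_rows([x, 0, x])
```

Unlike something, this does not rely on isinstance checks, so lists and NumPy scalars are handled the same way as arrays.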
Note that np.broadcast can be passed at most 32 arrays:
In [199]: np.column_stack(np.broadcast(*[x, 0, x]*100))
ValueError: Need at least two and fewer than (32) array objects.
whereas np.broadcast_arrays is unlimited:
In [198]: np.row_stack(np.broadcast_arrays(*[x, 0, x]*100))
Out[198]:
array([[ 1., 1., 1., 1., 1.],
       [ 0., 0., 0., 0., 0.],
       [ 1., 1., 1., 1., 1.],
       ...,
       [ 1., 1., 1., 1., 1.],
       [ 0., 0., 0., 0., 0.],
       [ 1., 1., 1., 1., 1.]])
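The difference can be checked without depending on the exact limit, which I believe varies between NumPy versions. A small sketch, using np.vstack in place of np.row_stack and illustrative variable names:

```python
import numpy as np

x = np.ones(5)
many = [x, 0, x] * 100  # 300 arguments, well above the limit

try:
    np.broadcast(*many)
    broadcast_accepted = True
except ValueError:
    # np.broadcast rejects argument lists longer than its built-in limit.
    broadcast_accepted = False

# np.broadcast_arrays has no such restriction.
rows = np.vstack(np.broadcast_arrays(*many))
```

Here rows has one row per input, so its shape is (300, 5).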
Using np.broadcast or np.broadcast_arrays is a bit more general than something. It will work on arrays of different (but broadcastable) shapes, for instance:
In [209]: np.column_stack(np.broadcast(*[np.atleast_2d(x), 0, x]))
Out[209]:
array([[ 1., 1., 1., 1., 1.],
       [ 0., 0., 0., 0., 0.],
       [ 1., 1., 1., 1., 1.]])
whereas something([np.atleast_2d(x), 0, x]) returns:
In [211]: something([np.atleast_2d(x), 0, x])
Out[211]:
array([array([[ 1., 1., 1., 1., 1.]]), array([ 0.]),
       array([ 1., 1., 1., 1., 1.])], dtype=object)