I want to add the previous n rows as columns to a NumPy array.
For example, if n=2, the array below...
[[ 1, 2]
[ 3, 4]
[ 5, 6]
[ 7, 8]
[ 9, 10]
[11, 12]]
...should be turned into the following one:
[[ 1, 2, 0, 0, 0, 0]
[ 3, 4, 1, 2, 0, 0]
[ 5, 6, 3, 4, 1, 2]
[ 7, 8, 5, 6, 3, 4]
[ 9, 10, 7, 8, 5, 6]
[11, 12, 9, 10, 7, 8]]
Any ideas how I could do that without going over the entire array in a for loop?
Here's a vectorized approach -
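import numpy as np

def vectorized_app(a, n):
    M, N = a.shape
    # idx[i, k] = i - k: pair each row with itself and its n predecessors
    idx = np.arange(M)[:, None] - np.arange(n + 1)
    out = a[idx.ravel(), :].reshape(-1, N * (n + 1))
    # Negative indices wrap around to the end of a, so zero those blocks out
    out[N * np.arange(1, M + 1)[:, None] <= np.arange(N * (n + 1))] = 0
    return out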
Sample run -
In [255]: a
Out[255]:
array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12],
       [13, 14, 15],
       [16, 17, 18]])
In [256]: vectorized_app(a,3)
Out[256]:
array([[ 1,  2,  3,  0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 4,  5,  6,  1,  2,  3,  0,  0,  0,  0,  0,  0],
       [ 7,  8,  9,  4,  5,  6,  1,  2,  3,  0,  0,  0],
       [10, 11, 12,  7,  8,  9,  4,  5,  6,  1,  2,  3],
       [13, 14, 15, 10, 11, 12,  7,  8,  9,  4,  5,  6],
       [16, 17, 18, 13, 14, 15, 10, 11, 12,  7,  8,  9]])
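To see why this works, here is a step-by-step sketch on the 6x2 array from the question with n=2. The intermediate names (`gathered`, `invalid`) are mine, chosen for illustration only. The steps are the same ones the function performs.

```python
import numpy as np

a = np.arange(1, 13).reshape(6, 2)   # the 6x2 array from the question
n = 2
M, N = a.shape

# idx[i, k] = i - k, so row i is paired with itself and its n predecessors.
idx = np.arange(M)[:, None] - np.arange(n + 1)

# Fancy indexing returns a copy, so a itself is left untouched.
# Negative entries in idx wrap around to the last rows of a.
gathered = a[idx.ravel(), :].reshape(M, N * (n + 1))

# Row i has only i real predecessors, i.e. N*(i+1) valid columns.
# Everything at or beyond that column came from a wrapped index.
invalid = N * np.arange(1, M + 1)[:, None] <= np.arange(N * (n + 1))
gathered[invalid] = 0

print(idx)
print(gathered)
```

The first row of `idx` is `[0, -1, -2]`, which is why the first output row would pick up the last two rows of `a` if the mask were not applied.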
Runtime test -
I am timing @Psidom's loop-comprehension based method (`mypad` is the helper from that answer) against the vectorized method from this post, on a version of the question's sample with each dimension scaled up 100x:
In [246]: a = np.random.randint(0,9,(600,200))
In [247]: n = 200
In [248]: %timeit np.column_stack(mypad(a, i) for i in range(n + 1))
1 loops, best of 3: 748 ms per loop
In [249]: %timeit vectorized_app(a,n)
1 loops, best of 3: 224 ms per loop
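`mypad` isn't reproduced in this post, so if you want to rerun the comparison yourself, a loop-based baseline along these lines should do. Note that `shift_rows` and `loop_app` are hypothetical stand-ins I made up for illustration, not @Psidom's actual code. I'm assuming the helper shifts the rows down by `i` and zero-fills the top. A list is passed to `np.column_stack` because recent NumPy releases expect a sequence there rather than a generator.

```python
import numpy as np

def shift_rows(a, i):
    """Hypothetical stand-in for `mypad`: shift rows down by i, zero-fill the top."""
    out = np.zeros_like(a)
    if i < a.shape[0]:
        out[i:] = a[:a.shape[0] - i]
    return out

def loop_app(a, n):
    # One shifted copy per lag 0..n, stacked side by side.
    return np.column_stack([shift_rows(a, i) for i in range(n + 1)])

a = np.arange(1, 13).reshape(6, 2)   # the 6x2 array from the question
print(loop_app(a, 2))
```

Timings will depend on the machine and NumPy version, so treat the numbers above as relative rather than absolute.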