 

Make numpy matrix with insufficient length of data

I have some data, say a list of 10 numbers, and I have to convert that list to a matrix of shape (3, 4). What would be the best way to do so if I wanted the data to fill by rows or by columns, and the unfilled spots to have some default value like -1?

For example:

data = [0,4,1,3,2,5,9,6,7,8]
>>> output
array([[ 0,  4,  1,  3],
       [ 2,  5,  9,  6],
       [ 7,  8, -1, -1]])

What I thought of doing is

data += [-1]*(row*col - len(data))
output = np.array(data).reshape((row, col))

Is there a simpler method that allows me to achieve the same result without having to modify the original data or sending in data + [-1]*remaining to the np.array function?

Asish M. asked Dec 03 '25 21:12



2 Answers

I'm sure there are various ways of doing this. My first inclination is to make an output array filled with the fill value, and copy the data into it. Since the fill is 'ragged', not a full column or row, I'd start out 1d and reshape to the final shape.

In [730]: row,col = 3,4
In [731]: data = [0,4,1,3,2,5,9,6,7,8]
In [732]: output=np.zeros(row*col,dtype=int)-1
In [733]: output[:len(data)]=data
In [734]: output = output.reshape(3,4)
In [735]: output
Out[735]: 
array([[ 0,  4,  1,  3],
       [ 2,  5,  9,  6],
       [ 7,  8, -1, -1]])

Regardless of whether data starts as a list or a 1d array, it will have to be copied to output. Because the total number of elements changes, we can't just reshape it.

This isn't that different from your approach of adding the extra values via [-1]*n.
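The question also asks about filling by columns, which the session above does not cover. A minimal sketch, assuming a hypothetical helper named fill_to_shape (not a NumPy function): it builds the filled 1d array with np.full, copies the data in, and reshapes with order="C" for row fill or order="F" for column fill. The original list is left untouched.

```python
import numpy as np


def fill_to_shape(data, shape, fill_value=-1, order="C"):
    """Hypothetical helper: place `data` into an array of `shape`,
    padding the unfilled tail with `fill_value`.

    order="C" fills row by row, order="F" fills column by column.
    """
    arr = np.asarray(data)
    size = int(np.prod(shape))
    if arr.size > size:
        raise ValueError("data has more elements than the target shape")
    # fill_value is cast to the dtype of the data
    flat = np.full(size, fill_value, dtype=arr.dtype)
    flat[:arr.size] = arr  # the one unavoidable copy
    return flat.reshape(shape, order=order)


data = [0, 4, 1, 3, 2, 5, 9, 6, 7, 8]
by_rows = fill_to_shape(data, (3, 4))
by_cols = fill_to_shape(data, (3, 4), order="F")
```

Here by_rows matches the output shown in the question, while by_cols places the first three values down the first column, the next three down the second, and so on.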

There is a pad function (np.pad), but on a 2d array it works on whole columns or rows, and internally it is quite complex because it's written for general cases.
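That said, np.pad can be applied to the data while it is still 1d, before reshaping. A small sketch of that variant, assuming the same row and col values as above; it still makes a copy, so it is mostly a matter of taste:

```python
import numpy as np

row, col = 3, 4
data = [0, 4, 1, 3, 2, 5, 9, 6, 7, 8]

# Pad 0 elements before and (row*col - len(data)) elements after the data.
n_missing = row * col - len(data)
padded = np.pad(data, (0, n_missing), mode="constant", constant_values=-1)
output = padded.reshape(row, col)
```

The original list is not modified, since np.pad returns a new array.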

hpaulj answered Dec 06 '25 10:12



Use np.ndarray.flat to index into the flattened version of the array.

import numpy as np

data = [0, 4, 1, 3, 2, 5, 9, 6, 7, 8]
default_value = -1
desired_shape = (3, 4)
# np.full keeps the integer dtype; default_value * np.ones(...) would give floats
output = np.full(desired_shape, default_value)
output.flat[:len(data)] = data

# output is now:
# array([[ 0,  4,  1,  3],
#        [ 2,  5,  9,  6],
#        [ 7,  8, -1, -1]])

As hpaulj says, the extra copy is really hard to avoid.

If you are reading data from a file somehow, you could read it into the flattened array directly, either using flat, or by reshaping the array afterward. Then the data gets directly loaded into the array with the desired shape.
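A rough sketch of the file-reading idea, assuming whitespace-separated integers. The io.StringIO object is only a stand-in for a real file, and the variable names are made up for illustration:

```python
import io

import numpy as np

row, col = 3, 4
fill_value = -1

# Stand-in for a real file handle (an assumption for this example).
fake_file = io.StringIO("0 4 1 3 2 5\n9 6 7 8\n")

# Allocate the final array once, then write each value through .flat
# as it is parsed, so no intermediate list of values is built.
output = np.full((row, col), fill_value)
n_read = 0
for line in fake_file:
    for token in line.split():
        output.flat[n_read] = int(token)
        n_read += 1
```

If the file holds more values than the array has room for, the assignment through flat raises an IndexError, which is probably the behaviour you want here.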

Praveen answered Dec 06 '25 11:12



