
Why can't we use a fill_value when reshaping a dataframe (array)?

I have this dataframe:

import numpy as np
import pandas as pd

df = pd.DataFrame([list("ABCDEFGHIJ")])

   0  1  2  3  4  5  6  7  8  9
0  A  B  C  D  E  F  G  H  I  J

I get an error when trying to reshape the dataframe/array:

np.reshape(df, (-1, 3))

ValueError: cannot reshape array of size 10 into shape (3)

I'm expecting this array (or a dataframe with the same shape):

array([['A', 'B', 'C'],
       ['D', 'E', 'F'],
       ['G', 'H', 'I'],
       ['J', nan, nan]], dtype=object)

Why can't NumPy infer the expected shape and fill the missing values with nan?

VERBOSE asked Dec 06 '25 07:12


1 Answer

Another possible solution, based on numpy.pad, which inserts the needed np.nan into the array:

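n = 3                        # number of columns after the reshape
s = df.shape[1]              # number of elements in the single-row dataframe
m = s // n + 1*(s % n != 0)  # number of rows: s / n rounded up
# Pad the flattened values on the right with np.nan, then reshape to (m, n)
np.pad(df.values.flatten(), (0, m*n - s),
       mode='constant', constant_values=np.nan).reshape(m, n)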

Explanation:

  • s // n is the integer division of the length of the original array by the number of columns after the reshape.

  • s % n gives the remainder of that division. For instance, if s = 9, then s // n is equal to 3 and s % n is equal to 0, so m = 3.

  • However, if s = 10, s // n is equal to 3 and s % n is equal to 1. Thus, s % n != 0 is True. Consequently, 1*(s % n != 0) is equal to 1, which makes m = 3 + 1 = 4.

  • (0, m*n - s) is the pad width: the number of np.nan to insert at the left of the array (0, in this case) and the number of np.nan to insert at the right of the array (m*n - s, which is 2 when s = 10).

Output:

array([['A', 'B', 'C'],
       ['D', 'E', 'F'],
       ['G', 'H', 'I'],
       ['J', nan, nan]], dtype=object)
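
If this is needed in more than one place, the same steps can be wrapped in a small function. The following is only a sketch: reshape_with_fill is a hypothetical name (nothing like it ships with NumPy or pandas), and it swaps np.pad for np.full plus slice assignment so that any fill value can be used with an object array. The row count uses -(-size // ncols), which is the same ceiling division as s // n + 1*(s % n != 0) above.

```python
import numpy as np


def reshape_with_fill(values, ncols, fill_value=np.nan):
    """Flatten `values`, pad on the right with `fill_value`,
    and reshape to (nrows, ncols).

    Hypothetical helper for illustration; not a NumPy or pandas API.
    """
    flat = np.asarray(values, dtype=object).ravel()
    nrows = -(-flat.size // ncols)  # ceiling division
    # Start from an array that already holds the fill value everywhere,
    # then overwrite the leading positions with the real data.
    out = np.full(nrows * ncols, fill_value, dtype=object)
    out[:flat.size] = flat
    return out.reshape(nrows, ncols)


# Same data as the single-row dataframe in the question;
# df.to_numpy() could be passed here instead.
values = np.array([list("ABCDEFGHIJ")], dtype=object)
result = reshape_with_fill(values, 3)
print(result.shape)  # (4, 3)
```

The result can be passed to pd.DataFrame(result) if a dataframe is wanted rather than an array.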

PaulS answered Dec 08 '25 21:12


