
Pandas to_numpy() results in an array of lists. How do I get a 2D numpy array from this?

I have a column of lists, and each list contains the same number of values.

If I do

df['column'].to_numpy()

I get an array of lists:

array([list([0, 4688, 11, 43486, 40508, 13, 5,...
       list([0, 40928, 17707, 22705, 9, 38312, 2..
       list([0, 6766, 368, 3551, 28837,..
      dtype=object)

How do I get a 2D array instead?

SantoshGupta7 asked Aug 31 '25 02:08


2 Answers

You can do this:

np.array(df['column'].tolist())

Or you can simply stack them:

np.stack(df['column'].to_numpy())

This stacks the lists on top of each other and produces a 2-D array. All lists must have the same length, because NumPy arrays are rectangular.
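As a rough sketch of both approaches side by side, assuming a small made-up DataFrame (the sample values and the variable names `obj_arr`, `via_tolist`, and `via_stack` are only for illustration and are not from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: every cell of 'column' is a list of equal length.
df = pd.DataFrame({"column": [[0, 1, 2], [3, 4, 5], [6, 7, 8]]})

# to_numpy() on its own keeps the lists as Python objects in a 1-D array.
obj_arr = df["column"].to_numpy()

# Option 1: build the array from a plain list of lists.
via_tolist = np.array(df["column"].tolist())

# Option 2: stack the per-row lists along a new first axis.
via_stack = np.stack(obj_arr)

print(obj_arr.shape, obj_arr.dtype)   # 1-D, object dtype
print(via_tolist.shape)               # 2-D
print(np.array_equal(via_tolist, via_stack))
```

If the lists differ in length, `np.stack` raises a `ValueError`, so it can be worth checking the lengths first.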

Ehsan answered Sep 02 '25 15:09


You can use a list comprehension:

>>> df
                 A
0  [0, 1, 2, 3, 4]
1  [0, 1, 2, 3, 4]
2  [0, 1, 2, 3, 4]

>>> np.array([x for x in df['A']])
array([[0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4]])

Dishin H Goyani answered Sep 02 '25 16:09