How to delete nan/null values in lists in a list in Python?

Question

I have a DataFrame with NaN values. I transform each row of that DataFrame into a list, which is then appended to another list.

Index   1   2   3   4   5   6   7   8   9   10  ... 71  72  73  74  75  76  77  78  79  80
orderid                                                                                 
20000765    624380  nan nan nan nan nan nan nan nan nan ... nan nan nan nan nan nan nan nan nan nan
20000766    624380  nan nan nan nan nan nan nan nan nan ... nan nan nan nan nan nan nan nan nan nan
20000768    1305984 1305985 1305983 1306021 nan nan nan nan nan nan ... nan nan nan nan nan nan nan nan nan nan

records = []
for i in range(0, 60550):
    records.append([str(dfpivot.values[i,j]) for j in range(0, 10)])

However, many rows contain NaN values, which I want to remove from each row's list before I append it to the list of lists. Where do I need to insert that code, and how do I do this?

I thought this code would do the trick, but I guess it only looks at the top-level elements of the list of lists:

records = [x for x in records if str(x) != 'nan']

I'm new to Python, so I'm still figuring out the basics.

yatu · Accepted Answer

One way is to take advantage of the fact that stack removes NaNs to generate the nested list:

df.stack().groupby(level=0).apply(list).values.tolist()
# [[624380.0], [624380.0], [1305984.0, 1305985.0, 1305983.0, 1306021.0]]
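
A self-contained sketch of this approach, for anyone who wants to try it without the original data. The small `df` below is a hypothetical stand-in for the question's `dfpivot`, and the explicit `dropna()` is an added safeguard: depending on the pandas version, `stack` may keep NaN entries instead of dropping them.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the pivot table: order IDs as the index,
# item IDs in the columns, NaN where an order has fewer items.
df = pd.DataFrame(
    [[624380, np.nan, np.nan, np.nan],
     [624380, np.nan, np.nan, np.nan],
     [1305984, 1305985, 1305983, 1306021]],
    index=pd.Index([20000765, 20000766, 20000768], name="orderid"),
    columns=[1, 2, 3, 4],
)

# stack() turns the frame into one long Series indexed by (orderid, column);
# dropna() makes sure no NaN entries survive, whatever stack() did with them.
stacked = df.stack().dropna()

# Collect the remaining values of each order into its own list.
nested = stacked.groupby(level=0).apply(list).tolist()
print(nested)
# [[624380.0], [624380.0], [1305984.0, 1305985.0, 1305983.0, 1306021.0]]
```

The values come out as floats because a column that holds NaN cannot be an integer column by default.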

oppressionslayer · Answer

If you want to keep the rows that contain NaNs and only drop the columns that are entirely NaN, you can do it like this:

In [5457]: df.T.dropna(how='all').T
Out[5457]: 
         Index           1           2           3           4
0 20000765.000  624380.000         nan         nan         nan
1 20000766.000  624380.000         nan         nan         nan
2 20000768.000 1305984.000 1305985.000 1305983.000 1306021.000

If you don't want any columns that contain NaNs, you can drop them like this:

In [5458]: df.T.dropna().T
Out[5458]: 
         Index           1
0 20000765.000  624380.000
1 20000766.000  624380.000
2 20000768.000 1305984.000
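
The same column filtering can probably be written without the double transpose, using the `axis=1` argument of `dropna`. A minimal sketch, assuming a hypothetical frame shaped like the output above, with an extra all-NaN column `5` added purely for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical frame; column 5 is entirely NaN, columns 2-4 are partly NaN.
df = pd.DataFrame({
    "Index": [20000765, 20000766, 20000768],
    1: [624380, 624380, 1305984],
    2: [np.nan, np.nan, 1305985],
    3: [np.nan, np.nan, 1305983],
    4: [np.nan, np.nan, 1306021],
    5: [np.nan, np.nan, np.nan],
})

# Drop only the columns in which every value is NaN.
drop_empty = df.dropna(axis=1, how="all")

# Drop every column that contains at least one NaN (how="any" is the default).
drop_any = df.dropna(axis=1)

print(list(drop_empty.columns))  # ['Index', 1, 2, 3, 4]
print(list(drop_any.columns))    # ['Index', 1]
```

Skipping the transpose also leaves each column's dtype alone, whereas `df.T` has to find one common dtype for the whole frame.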

To create the nested list, including the Index column:

In [5464]: df.T.apply(lambda x: x.dropna().tolist()).tolist()
Out[5464]: 
[[20000765.0, 624380.0],
 [20000766.0, 624380.0],
 [20000768.0, 1305984.0, 1305985.0, 1305983.0, 1306021.0]]

or, without the Index column:

In [5471]: df.T[1:].apply(lambda x: x.dropna().tolist()).tolist()
Out[5471]: [[624380.0], [624380.0], [1305984.0, 1305985.0, 1305983.0, 1306021.0]]

depending on how you want the nested list to look.
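
For the loop in the question itself, the NaN check can also go inside the inner list comprehension, so each row is filtered while it is being built. A minimal sketch in plain Python, where `rows` is a hypothetical stand-in for `dfpivot.values` and the values are assumed to be floats:

```python
import math

nan = float("nan")

# Stand-in for dfpivot.values: one NaN-padded row per order.
rows = [
    [624380.0, nan, nan, nan],
    [624380.0, nan, nan, nan],
    [1305984.0, 1305985.0, 1305983.0, 1306021.0],
]

# Filter each row while building it: skip NaN, convert the rest to str.
records = [[str(v) for v in row if not math.isnan(v)] for row in rows]
print(records)
# [['624380.0'], ['624380.0'], ['1305984.0', '1305985.0', '1305983.0', '1306021.0']]

# If the list of string lists already exists, filter one level deeper
# than the attempt in the question, which compared whole inner lists.
unfiltered = [[str(v) for v in row] for row in rows]
cleaned = [[s for s in rec if s != "nan"] for rec in unfiltered]
print(cleaned == records)  # True
```

The attempt in the question never removes anything because `str(x)` is applied to a whole inner list, and the string form of a list is never exactly `'nan'`.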

Tags: python, pandas

Tim Hellegers
