
Why NaN in pivot table?

I've removed all NaN from a df using df = df.fillna(0).

After I create a pivot table using

pd.pivot_table(df, index='Source', columns='Customer Location', values='Total billed £')

the output still contains NaN.

Could someone explain why this happens and how to prevent it?


1 Answer

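It comes from your input data. pivot_table turns the unique values of one column into the index and the unique values of another into the columns. Each cell holds the aggregated value (the mean by default) for that index/column combination. If a combination never occurs in the input data, there is nothing to aggregate, so that cell is missing (NaN). Calling fillna(0) on df beforehand does not help, because these NaN are created by the reshaping rather than carried over from the input.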

import pandas as pd

df = pd.DataFrame({
    'Source': list('abcdef'),
    'Total billed £': [5, 3, 6, 9, 2, 4],
    'Customer Location': list('adfbbb'),
})

print(df)
  Source  Total billed £ Customer Location
0      a               5                 a
1      b               3                 d
2      c               6                 f
3      d               9                 b
4      e               2                 b
5      f               4                 b

# e.g. Source=a and Customer Location=b never occur together in the input, so that cell is NaN
print(pd.pivot_table(df, index='Source', columns='Customer Location', values='Total billed £'))
Customer Location    a    b    d    f
Source                               
a                  5.0  NaN  NaN  NaN
b                  NaN  NaN  3.0  NaN
c                  NaN  NaN  NaN  6.0
d                  NaN  9.0  NaN  NaN
e                  NaN  2.0  NaN  NaN
f                  NaN  4.0  NaN  NaN
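
To prevent the NaN, one option is the fill_value parameter of pivot_table, which substitutes a value for combinations that are absent from the input. The sketch below reuses the sample data from above and assumes that 0 is a sensible stand-in for "nothing billed"; the name filled is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Source': list('abcdef'),
    'Total billed £': [5, 3, 6, 9, 2, 4],
    'Customer Location': list('adfbbb'),
})

# fill_value replaces the cells of Source/Customer Location
# combinations that do not exist in df (assumption: 0 is acceptable here)
filled = pd.pivot_table(
    df,
    index='Source',
    columns='Customer Location',
    values='Total billed £',
    fill_value=0,
)
print(filled)
```

Calling .fillna(0) on the result of pd.pivot_table should give an equivalent table. Either way, keep in mind that a 0 then means "no such combination", which is not the same as a real billed amount of 0.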

Furthermore, here's a good read on reshaping data.

jezrael answered Sep 10 '25 05:09
