How to replace <NA> values with np.nan -- file imported using Pandas read_pickle()

I created a pandas DataFrame by loading a pickle (.p) file with pd.read_pickle(). The head of the DataFrame is shown below. It looks like <NA> values appear wherever there is no data. I want to convert these <NA> values into np.nan.

sequels = pd.read_pickle(r'D:\Learning\Datacamp\Datasets/sequels.p')
print(sequels.head())
      id         title  sequel
0  19995        Avatar    <NA>
1    862     Toy Story     863
2    863   Toy Story 2   10193
3    597       Titanic    <NA>
4  24428  The Avengers    <NA>

I have tried a few methods: sequels.replace('<NA>', np.nan), sequels.fillna(np.nan), and a regex replacement, sequels.replace(r'^\s*$', np.nan, regex=True).

In all these cases, the values are not getting replaced. Any suggestions?

Srinivas asked Sep 06 '25 03:09


1 Answer

The original column uses an integer dtype with its own missing-value marker (shown as <NA>), so converting the column to float turns the missing values into np.nan:

df['sequel'] = df['sequel'].astype('float')
print(df)
      id         title   sequel
0  19995        Avatar      NaN
1    862     Toy Story    863.0
2    863   Toy Story 2  10193.0
3    597       Titanic      NaN
4  24428  The Avengers      NaN
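
For anyone without the pickle file, here is a minimal reproducible sketch of the same conversion. It assumes the sequel column uses pandas' nullable Int64 dtype, which is a guess based on the <NA> display; the DataFrame below is a hypothetical stand-in built from the rows shown in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the pickled data. The 'sequel' column is built
# with the nullable Int64 dtype (an assumption about the original file).
df = pd.DataFrame({
    "id": [19995, 862, 863, 597, 24428],
    "title": ["Avatar", "Toy Story", "Toy Story 2", "Titanic", "The Avengers"],
    "sequel": pd.array([pd.NA, 863, 10193, pd.NA, pd.NA], dtype="Int64"),
})

# Check which dtype the column has before converting.
print(df["sequel"].dtype)

# Casting to float turns each pd.NA into np.nan.
df["sequel"] = df["sequel"].astype("float")

print(df["sequel"].dtype)
print(df)
```

Checking df.dtypes on the real data first is worthwhile, since the right fix depends on the dtype the column actually has.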

Solution with replace:

df['sequel'] = df['sequel'].replace({pd.NA: np.nan})

print(df)
      id         title   sequel
0  19995        Avatar      NaN
1    862     Toy Story    863.0
2    863   Toy Story 2  10193.0
3    597       Titanic      NaN
4  24428  The Avengers      NaN

Or:

df['sequel'].replace({pd.NA: np.nan}, inplace=True)
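
A likely reason the string and regex attempts in the question had no effect: the <NA> in the printout is how pandas displays the pd.NA object. It is not the text '<NA>', so there is nothing for a string match to find. A small sketch on a hypothetical Series (again assuming the Int64 dtype) illustrates the difference:

```python
import pandas as pd

# Hypothetical nullable-integer column (assumed Int64 dtype).
s = pd.Series(pd.array([pd.NA, 863, 10193], dtype="Int64"))

first = s.iloc[0]
print(first is pd.NA)          # is the missing entry the pd.NA singleton?
print(str(first))              # the text form used when printing
print(isinstance(first, str))  # is it a string?

# Missing entries are found with isna(), not by comparing against text.
print(s.isna().tolist())
```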
jezrael answered Sep 07 '25 22:09