Pandas DataFrame: how to fill NaN with 0, but only the NaNs that lie between valid values?

What I'd like to do:

In [2]: b = pd.DataFrame({"a": [np.nan, 1, np.nan, 2, np.nan]})
Out[2]:
      a
0   nan
1 1.000
2   nan
3 2.000
4   nan

Expected output:

      a
0   nan
1 1.000
2 0.000
3 2.000
4   nan

As you can see here, only nans that are surrounded by valid values are replaced with 0.

How can I do this?

  • df.interpolate(limit_area='inside') looks good to me but it doesn't have an argument to fill with 0s...
user3595632 asked Sep 06 '25 03:09

1 Answer

Method 1: interpolate, isna, notna and loc

You can call interpolate with limit_area='inside', which only fills NaNs that have valid values on both sides. The positions that are NaN in the original column but filled in the interpolated result are exactly the ones to replace with 0:

s = df['a'].interpolate(limit_area='inside')

m1 = df['a'].isna()
m2 = s.notna()

df.loc[m1&m2, 'a'] = 0

     a
0  NaN
1  1.0
2  0.0
3  2.0
4  NaN
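
For reference, here is a self-contained sketch of the same idea that can be run as is. The helper name fill_inside_nans and its value parameter are illustrative assumptions, not part of pandas, and s.mask is used in place of the loc assignment so the input Series is not modified in place:

```python
import numpy as np
import pandas as pd


def fill_inside_nans(s: pd.Series, value=0) -> pd.Series:
    """Hypothetical helper: replace only NaNs that have valid values on both sides."""
    # limit_area="inside" restricts interpolation to NaNs surrounded by valid values
    interpolated = s.interpolate(limit_area="inside")
    # NaN in the original but filled after interpolation means an "inside" NaN
    inside = s.isna() & interpolated.notna()
    # mask() returns a copy with `value` wherever the condition is True
    return s.mask(inside, value)


df = pd.DataFrame({"a": [np.nan, 1, np.nan, 2, np.nan]})
df["a"] = fill_inside_nans(df["a"])
print(df)
```

Leading and trailing NaNs stay as they are, because interpolate never touches them when limit_area='inside' is set.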

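Method 2: shift and loc

A simpler method is to check that the previous row and the next row are both not NaN, and fill those positions with 0. This only handles isolated NaNs; in a run of two or more consecutive NaNs, no element has a valid value on both sides, so the run is left unfilled.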

m1 = df['a'].shift().notna()
m2 = df['a'].shift(-1).notna()
m3 = df['a'].isna()

df.loc[m1&m2&m3, 'a'] = 0

     a
0  NaN
1  1.0
2  0.0
3  2.0
4  NaN
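
If the data can contain gaps longer than one row, the shift-based mask will miss them. The sketch below compares it with a different mask built from ffill and bfill (a swapped-in variant, not the method shown above): a NaN counts as "inside" when there is some valid value both earlier and later in the column. The sample Series and the variable names are assumptions chosen for illustration:

```python
import numpy as np
import pandas as pd

# Sample data with a gap of two consecutive NaNs between valid values
s = pd.Series([np.nan, 1, np.nan, np.nan, 2, np.nan])

# Shift-based mask: requires the immediate neighbours to be valid
single = s.isna() & s.shift().notna() & s.shift(-1).notna()

# ffill/bfill-based mask: requires any valid value before and any valid value after
inside = s.isna() & s.ffill().notna() & s.bfill().notna()

filled = s.mask(inside, 0)

print(single.tolist())
print(inside.tolist())
print(filled.tolist())
```

On this input the shift-based mask selects nothing, while the ffill/bfill mask selects both rows of the gap and still leaves the leading and trailing NaNs alone.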
Erfan answered Sep 07 '25 20:09
