Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pandas - fill NaN based on the previous value of another cell

Tags:

python

pandas

I have some stocks data in a dataframe that I'm resampling, which results in some NaN values. Here's a section of the raw feed:

In [34]: feeddf
Out[34]:
                      open   high    low  close  volume
date
2017-12-03 07:00:00  14.46  14.46  14.46  14.46   25000
2017-12-03 07:01:00  14.46  14.46  14.46  14.46   20917
2017-12-03 07:06:00  14.50  14.50  14.50  14.50    2000
2017-12-03 07:12:00  14.50  14.56  14.50  14.56   17000

The feed is supposed to be minute-by-minute, but when there's not data available, the row is skipped. When resampling the dataframe and aggregating for the opens, highs, lows, and closes, it looks like this:

In [35]: feeddf.resample('3Min').agg({'open': 'first', 
                                      'high': 'max',  
                                      'low': 'min',  
                                      'close': 'last'})
Out[35]:
                      open   high    low  close
date
2017-12-03 07:00:00  14.46  14.46  14.46  14.46
2017-12-03 07:03:00    NaN    NaN    NaN    NaN
2017-12-03 07:06:00  14.50  14.50  14.50  14.50
2017-12-03 07:09:00    NaN    NaN    NaN    NaN
2017-12-03 07:12:00  14.50  14.56  14.50  14.56

My question: I want to forward-fill the missing data based on the last row's close value. df.fillna(method='ffill') is not helping because it fills it based on the last value on the same column. Any idea?

like image 630
Abdul Avatar asked Sep 07 '25 02:09

Abdul


1 Answers

First forward fill last column close and then bfill by columns:

print (df)
                      open   high    low  close
date                                           
2017-12-03 07:00:00  14.46  14.46  14.46  14.81
2017-12-03 07:03:00    NaN    NaN    NaN    NaN
2017-12-03 07:06:00  14.50  14.50  14.50  14.59
2017-12-03 07:09:00    NaN    NaN    NaN    NaN
2017-12-03 07:12:00  14.50  14.56  14.50  14.56


df['close'] = df['close'].ffill()
df = df.bfill(axis=1)
print (df)
                      open   high    low  close
date                                           
2017-12-03 07:00:00  14.46  14.46  14.46  14.81
2017-12-03 07:03:00  14.81  14.81  14.81  14.81
2017-12-03 07:06:00  14.50  14.50  14.50  14.59
2017-12-03 07:09:00  14.59  14.59  14.59  14.59
2017-12-03 07:12:00  14.50  14.56  14.50  14.56
like image 186
jezrael Avatar answered Sep 10 '25 06:09

jezrael