I have CSV data that looks like this:
      A   B
0  x aa
1  z aa
2
3
4  x aa
5  z bb
6  x bb
7
8  z cc
I would like to fill the empty cells in column B with the last token of the value in A. For rows where A is NaN, B should be filled only if, in column A:
last_available_value_before_the_NaNs_in_A.split()[-1] == next_available_value_after_the_NaNs_in_A.split()[-1]
The desired result would be:
      A   B
0  x aa  aa
1  z aa  aa
2        aa
3        aa
4  x aa  aa
5  z bb  bb
6  x bb  bb
7
8  z cc  cc
data.loc[7, 'B'] will be NaN because data.loc[6, 'A'].split()[-1] ==
data.loc[8, 'A'].split()[-1] is False.
data.loc[5, 'B'] is 'bb' because data.loc[5, 'A'].split()[-1] == 'bb'.
Thanks for your help!
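You could compare a forward-filled version of A with a back-filled one, and fill B wherever the two agree:
# Last token of the nearest non-NaN value at or above (f) and at or below (b) each row
f = df['A'].ffill().str.split().str[-1]
b = df['A'].bfill().str.split().str[-1]
# Keep B where the tokens differ; otherwise take the shared token
df['B'] = df['B'].where(f != b, f)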
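For anyone who wants to try this end to end, here is a minimal, self-contained sketch. It rebuilds the sample frame by hand and assumes that B starts out entirely empty, which the question implies but does not state outright. It then applies the same forward-fill/back-fill comparison using the ffill() and bfill() methods.

```python
import numpy as np
import pandas as pd

# Sample frame reconstructed from the question.
# Assumption: column B is completely empty (NaN) to begin with.
df = pd.DataFrame({
    "A": ["x aa", "z aa", np.nan, np.nan, "x aa", "z bb", "x bb", np.nan, "z cc"],
    "B": pd.Series([np.nan] * 9, dtype="object"),
})

# Last token of the nearest non-NaN value at or above each row ...
f = df["A"].ffill().str.split().str[-1]
# ... and of the nearest non-NaN value at or below each row.
b = df["A"].bfill().str.split().str[-1]

# where() keeps B when the condition is True (tokens differ)
# and takes the value from f when it is False (tokens agree).
df["B"] = df["B"].where(f != b, f)

print(df["B"].tolist())
# ['aa', 'aa', 'aa', 'aa', 'aa', 'bb', 'bb', nan, 'cc']
```

Rows where A is present always have f equal to b, so they simply receive their own last token. Only the NaN gaps depend on whether the neighbours on either side agree, which is why row 7 stays empty.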