I have a DataFrame with nulls in one column (Company). The same rows have another column (ID) whose non-null values repeat. What is the proper way to fill those nulls, using the ID column as a reference, with native pandas functions?
Thank you for your help.
Original:
Company ID
AAA 100
BBB 200
CCC 150
NULL 100
FFF 375
NULL 150
Formatted:
Company ID
AAA 100
BBB 200
CCC 150
AAA 100
FFF 375
CCC 150
You can try:
df['Company'] = df.groupby('ID')['Company'].transform('first')
As noted in the comments, the above replaces every Company value, not just the NaN ones, so it can give wrong results if an ID has several different Company values. Instead, you can do:
df['Company'] = df['Company'].fillna(df.groupby('ID')['Company'].transform('first'))
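To make the difference between the two approaches concrete, here is a minimal sketch. It rebuilds the sample data from the question, assuming the NULL entries are NaN, and adds a hypothetical second frame (mixed, with the made-up company name ZZZ) in which one ID carries two different company names. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Sample data from the question; NULL is assumed to mean NaN.
df = pd.DataFrame({
    "Company": ["AAA", "BBB", "CCC", np.nan, "FFF", np.nan],
    "ID": [100, 200, 150, 100, 375, 150],
})

# First non-null Company within each ID group, aligned to the original rows.
first_per_id = df.groupby("ID")["Company"].transform("first")

# Fill only the missing entries; existing values are left untouched.
filled = df["Company"].fillna(first_per_id)
print(filled.tolist())

# Hypothetical case: ID 100 appears with two different company names.
mixed = pd.DataFrame({
    "Company": ["AAA", "ZZZ", np.nan],
    "ID": [100, 100, 100],
})
mixed_first = mixed.groupby("ID")["Company"].transform("first")

# Assigning the transform directly would overwrite "ZZZ" with "AAA".
overwritten = mixed_first
# Using fillna keeps "ZZZ" and fills only the NaN row.
preserved = mixed["Company"].fillna(mixed_first)
print(overwritten.tolist())
print(preserved.tolist())
```

Note that if an ID has no non-null Company at all, both approaches leave those rows as NaN, since there is nothing in the group to fill from.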