Let's suppose I have the following DataFrame:
import numpy as np
import pandas as pd

df = pd.DataFrame({'label': ['a', 'a', 'b', 'b', 'a', 'b', 'c', 'c', 'a', 'a'],
                   'numbers': [1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
                   'arbitrarydata': [False] * 10})
I want to assign a value to the arbitrarydata column according to the values in both of the other columns. A naive approach would be as follows:
# np is numpy (import numpy as np); pd.np is no longer available in recent pandas
for _, grp in df.groupby(['label', 'numbers']):
    grp['arbitrarydata'] = np.random.rand()
Naturally, this doesn't propagate changes back to df. Is there a way to modify a group such that changes are reflected in the original DataFrame?
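Try using transform on the column you want to fill, e.g.:
# pass the keys as a list and select the column so transform returns a Series aligned with df
df['arbitrarydata'] = df.groupby(['label', 'numbers'])['arbitrarydata'].transform(lambda x: np.random.rand())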
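For comparison, here is a minimal sketch of three ways to get one random value per (label, numbers) group back onto the original rows. The names rng, grouped, via_transform, via_ngroup and result are made up for illustration, and the seed is only there to make the run repeatable; treat this as one possible approach rather than the definitive one.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'label': ['a', 'a', 'b', 'b', 'a', 'b', 'c', 'c', 'a', 'a'],
    'numbers': [1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
    'arbitrarydata': [False] * 10,
})

rng = np.random.default_rng(0)  # assumption: seeded only for repeatability
grouped = df.groupby(['label', 'numbers'])

# Option 1: transform broadcasts the scalar returned for each group
# onto that group's rows and returns a Series aligned with df's index.
via_transform = grouped['arbitrarydata'].transform(lambda s: rng.random())

# Option 2: draw one value per group up front, then pick each row's value
# by its integer group number (no Python-level function call per group).
per_group = rng.random(grouped.ngroups)
via_ngroup = pd.Series(per_group[grouped.ngroup().to_numpy()], index=df.index)

# Option 3: keep the explicit loop, but write through .loc on the target
# frame using the group's index labels instead of mutating the group copy.
result = df.copy()
result['arbitrarydata'] = result['arbitrarydata'].astype(float)
for _, grp in grouped:
    result.loc[grp.index, 'arbitrarydata'] = rng.random()
```

Any of the three can be assigned to df['arbitrarydata']. The loop in the question fails because each grp is a separate object from df, so option 3 works only because it writes to the frame itself by index label.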