Given the following DataFrame:
df = pd.DataFrame([[30, 20, {'some_data': 30}]], columns=['a', 'b', 'c'])
I would like to create a new column holding the value stored under the some_data key.
I was thinking of something like:
df['new_column'] = df['c']['some_data']
Is there a simple way to do this? In reality the dict is more complex, and I will need to extract a nested value.
EDIT 1:
Here is an example with nested data, which is closer to my real problem.
df = pd.DataFrame([[30, 20, {'some_data': [{'other_data': 0}]}]], columns=['a', 'b', 'c'])
# I would like to do something like:
df['new_column'] = df['c']['some_data'][0]['other_data'] 
Use the .str accessor:
df.c.str['some_data']
#0    30
#Name: c, dtype: int64
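To get the column the question asks for, the result can be assigned straight back to the frame. A minimal sketch, reusing the question's df and new_column names:

```python
import pandas as pd

# Same frame as in the question: column 'c' holds a dict in each row.
df = pd.DataFrame([[30, 20, {'some_data': 30}]], columns=['a', 'b', 'c'])

# df['c']['some_data'] fails because df['c'] is a Series, so 'some_data'
# is looked up as a row label in its index, not as a key of each dict.
# .str[...] applies the lookup to every element instead.
df['new_column'] = df['c'].str['some_data']

print(df['new_column'].tolist())
```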
You can further chain .str for nested data access, given:
df = pd.DataFrame([[30, 20, {'some_data': [{'other_data': 0}]}]], columns=['a', 'b', 'c'])
df
#    a   b                                   c
#0  30  20  {'some_data': [{'other_data': 0}]}
To access the nested other_data field, you can do:
df.c.str['some_data'].str[0].str['other_data']
#0    0
#Name: c, dtype: int64
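If the chained .str calls get hard to read, or some rows may lack a key, a plain Python helper applied per row is an alternative. This is only a sketch: get_nested and its path/default parameters are hypothetical names introduced here, not part of pandas.

```python
import pandas as pd

df = pd.DataFrame(
    [[30, 20, {'some_data': [{'other_data': 0}]}]],
    columns=['a', 'b', 'c'],
)

def get_nested(value, path, default=None):
    """Hypothetical helper: follow each key or index in `path` in turn.

    Returns `default` as soon as a step is missing or not subscriptable.
    """
    for key in path:
        try:
            value = value[key]
        except (KeyError, IndexError, TypeError):
            return default
    return value

# Walk dict key -> list index -> dict key for every row of column 'c'.
df['new_column'] = df['c'].apply(
    lambda cell: get_nested(cell, ['some_data', 0, 'other_data'])
)

print(df['new_column'].tolist())
```

Like .str, apply runs a Python-level loop over the rows, so this is about readability and handling missing keys rather than speed.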