I have a data frame that looks like this:
a = {'price': [1, 2],
'nested_column':
[[{'key': 'code', 'value': 'A', 'label': 'rif1'},
{'key': 'datemod', 'value': '31/09/2022', 'label': 'mod'}],
[{'key': 'code', 'value': 'B', 'label': 'rif2'},
{'key': 'datemod', 'value': '31/08/2022', 'label': 'mod'}]]}
df = pd.DataFrame(data=a)
My expected output should look like this:
b = {'price': [1, 2],
'code':["A","B"],
'datemod':["31/09/2022","31/08/2022"]}
exp_df = pd.DataFrame(data=b)
I tried some lines of code, that unfortunately don't do the job, that look like this:
df = pd.concat([df.drop(['nested_column'], axis=1), df['nested_column'].apply(pd.Series)], axis=1)
df = pd.concat([df.drop([0], axis=1), df[0].apply(pd.Series)], axis=1)
You can pop and explode your column to feed to json_normalize, then pivot according to the desired key/value and join:
# pop the json column and explode to rows
s = df.pop('nested_column').explode()
df = df.join(pd.json_normalize(s) # normalize dictionary to columns
.assign(idx=s.index) # ensure same index
.pivot(index='idx', columns='key', values='value')
)
output:
price code datemod
0 1 A 31/09/2022
1 2 B 31/08/2022
Get key: value pairs from nested dictionaries and flatten values by json_normalize:
f = lambda x: {y['key']:y['value'] for y in x for k, v in y.items()}
df['nested_column'] = df['nested_column'].apply(f)
print (df)
price nested_column
0 1 {'code': 'A', 'datemod': '31/09/2022'}
1 2 {'code': 'B', 'datemod': '31/08/2022'}
df1 = df.join(pd.json_normalize(df.pop('nested_column')))
print (df1)
price code datemod
0 1 A 31/09/2022
1 2 B 31/08/2022
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With