I am looking for any advice on how to cleanly convert a python multi level nested dictionary (from JSON) into a data frame boolean table.
Rules:
Example Input:
{1:{'group_a':{'bool_a':True,
'bool_b':True,
'bool_n':True},
'group_n':{'bool_b':True,
'bool_n':True}
},
2:{'group_a':{'bool_a':True,
'bool_b':True,
'bool_n':True},
'group_n':{'bool_b':True,
'bool_n':True}
},
'n':{'group_a':{'bool_a':True,
'bool_c':True},
'group_n':{'bool_b':True}
},
}
Desired Output:
Ga_Ba, Ga_Bb, Ga_Bc, Ga_Bn, Gn_Ba, Gn_Bb, ... Gn_Bn....
1 True True False True False True True
2 True True False True False True True
n True False True False False False False
...
Ideas? Bonus points for speed and conciseness. I have a solution but I am looking for something more elegant than the for loop mess I have now. Alternative data structures may also be welcome.
s = pd.DataFrame.from_dict(data, orient='index').stack()
pd.json_normalize(s).set_index(s.index) \
.stack().unstack([1, 2], fill_value=False) \
.sort_index(axis=1)
group_a group_n
bool_a bool_b bool_c bool_n bool_b bool_n
1 True True False True True True
2 True True False True True True
3 True False True False True False
pd.DataFrame.from_dict({
k0: {
f'G{k1.split("_")[1]}_B{k2.split("_")[1]}': val
for k1, d1 in d0.items()
for k2, val in d1.items()
}
for k0, d0 in data.items()
}, orient='index').fillna(False)
Ga_Ba Ga_Bb Ga_Bn Gn_Bb Gn_Bn Ga_Bc
1 True True True True True False
2 True True True True True False
3 True False False True False True
You could use a dictionary comprehension and concat
:
import pandas as pd
values = {
"1": {
"group_a": {"bool_a": True, "bool_b": True, "bool_n": True},
"group_n": {"bool_b": True, "bool_n": True},
},
"2": {
"group_a": {"bool_a": True, "bool_b": True, "bool_n": True},
"group_n": {"bool_b": True, "bool_n": True},
},
"n": {"group_a": {"bool_a": True, "bool_c": True}, "group_n": {"bool_b": True}},
}
stacked_values = {k: pd.DataFrame(v).stack() for k, v in values.items()}
df = (
pd.concat(stacked_values, axis=1)
.T.fillna(False)
.swaplevel(axis=1) # optional
.sort_index(axis=1)
)
Output:
group_a group_n
bool_a bool_b bool_c bool_n bool_b bool_n
1 True True False True True True
2 True True False True True True
n True False True False True False
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With