This question differs from the others because in none of them does the column name reside in the value of a key. Please look at the examples given before marking it as a duplicate.
I have a df like so:
df: col1 col2 col3
100 200 [{'attribute': 'Pattern', 'value': 'Printed'},...
Closer look at column 3 looks like:
[{'attribute': 'Pattern', 'value': 'Printed'},
{'attribute': 'Topwear style', 'value': 'T shirt'},
{'attribute': 'Bottomwear Length', 'value': 'Short'},
{'attribute': 'Colour Palette', 'value': 'Bright colours'},
{'attribute': 'Bottomwear style', 'value': 'Baggy'},
{'attribute': 'Topwear length', 'value': 'Waist'},
{'attribute': 'Sleeve style', 'value': 'Sleeveless'},
{'attribute': 'Type of pattern', 'value': 'Graphic print'},
{'attribute': 'Neck', 'value': 'Round'},
{'attribute': 'Level of embellishment', 'value': 'No'}]
Each attribute is a column name, and each value is the value for that column.
The output will look something like this:
df: col1 col2 Pattern Topwear style Bottomwear Length ....
100 200 Printed T shirt Short
There are multiple rows with repeating and new attributes and values. How would I go about doing this in pandas? I tried searching for something similar but couldn't find anything useful.
Try with:
# build one single-row frame per col3 cell (attributes become columns), then join on the index
df = df.join(pd.concat([pd.DataFrame(v).set_index('attribute').T
                        for v in df.pop('col3')]).reset_index(drop=True))
Setup:
import pandas as pd
d=[{'attribute': 'Pattern', 'value': 'Printed'},
{'attribute': 'Topwear style', 'value': 'T shirt'},
{'attribute': 'Bottomwear Length', 'value': 'Short'},
{'attribute': 'Colour Palette', 'value': 'Bright colours'},
{'attribute': 'Bottomwear style', 'value': 'Baggy'},
{'attribute': 'Topwear length', 'value': 'Waist'},
{'attribute': 'Sleeve style', 'value': 'Sleeveless'},
{'attribute': 'Type of pattern', 'value': 'Graphic print'},
{'attribute': 'Neck', 'value': 'Round'},
{'attribute': 'Level of embellishment', 'value': 'No'}]
df = pd.DataFrame({'col1': 100, 'col2': 200, 'col3': [d]}, index=[0])
Output:

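# each cell of col3 in the test frame is a single {'attribute': ..., 'value': ...} dict
x = df['col3'].tolist()
# map every attribute name to a one-element list, which gives a one-row frame
newcol = {item['attribute']: [item['value']] for item in x}
newdf = pd.DataFrame(newcol)
del df['col3']
# the right join keeps only the index of newdf (row 0)
print(df.join(newdf, how='right'))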
Output
col1 col2 Pattern Topwear style Bottomwear Length Colour Palette \
0 100 200 Printed T shirt Short Bright colours
...
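Dataframe used for the test. Because col1 and col2 are scalars, pandas broadcasts them, so this frame has one row per dict in col3 (unlike the frame in the question, where a single cell holds the whole list):
import pandas as pd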
data = {'col1':100, 'col2': 200, 'col3': [{'attribute': 'Pattern', 'value': 'Printed'},
{'attribute': 'Topwear style', 'value': 'T shirt'},
{'attribute': 'Bottomwear Length', 'value': 'Short'},
{'attribute': 'Colour Palette', 'value': 'Bright colours'},
{'attribute': 'Bottomwear style', 'value': 'Baggy'},
{'attribute': 'Topwear length', 'value': 'Waist'},
{'attribute': 'Sleeve style', 'value': 'Sleeveless'},
{'attribute': 'Type of pattern', 'value': 'Graphic print'},
{'attribute': 'Neck', 'value': 'Round'},
{'attribute': 'Level of embellishment', 'value': 'No'}]}
df = pd.DataFrame(data)
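Since the question mentions multiple rows with repeating and new attributes, a multi-row sketch may help. The example below is not taken from either answer; it assumes each cell of col3 holds a list of {'attribute': ..., 'value': ...} dicts, as in the question. The two-row frame and the 'Plain' value are made up for illustration, and the names expanded and result are hypothetical.

```python
import pandas as pd

# Hypothetical two-row frame: every col3 cell is a list of attribute/value dicts,
# and the rows share some attributes while introducing others.
df = pd.DataFrame({
    "col1": [100, 101],
    "col2": [200, 201],
    "col3": [
        [{"attribute": "Pattern", "value": "Printed"},
         {"attribute": "Neck", "value": "Round"},
         {"attribute": "Sleeve style", "value": "Sleeveless"}],
        [{"attribute": "Pattern", "value": "Plain"},
         {"attribute": "Topwear length", "value": "Waist"}],
    ],
})

# Turn each cell into one flat dict {attribute: value}; pandas takes the union
# of the keys as columns and fills attributes a row lacks with NaN.
expanded = pd.DataFrame(
    [{d["attribute"]: d["value"] for d in cell} for cell in df["col3"]],
    index=df.index,  # keep the original index so the join lines up
)

result = df.drop(columns="col3").join(expanded)
print(result)
```

Passing index=df.index keeps the join aligned even when the frame does not have a default 0..n-1 index. If an attribute appears twice in the same cell, the later value wins in this sketch.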