
Convert a column with a list of dicts, where column name and value are both present as values inside dict keys

This question differs from the other ones because in none of them does the column name reside in the value of a key. Please look at the examples given before marking it as a duplicate.

I have a df like so:

df: col1 col2 col3
    100  200  [{'attribute': 'Pattern', 'value': 'Printed'},...

A closer look at column 3:

[{'attribute': 'Pattern', 'value': 'Printed'},
 {'attribute': 'Topwear style', 'value': 'T shirt'},
 {'attribute': 'Bottomwear Length', 'value': 'Short'},
 {'attribute': 'Colour Palette', 'value': 'Bright colours'},
 {'attribute': 'Bottomwear style', 'value': 'Baggy'},
 {'attribute': 'Topwear length', 'value': 'Waist'},
 {'attribute': 'Sleeve style', 'value': 'Sleeveless'},
 {'attribute': 'Type of pattern', 'value': 'Graphic print'},
 {'attribute': 'Neck', 'value': 'Round'},
 {'attribute': 'Level of embellishment', 'value': 'No'}]

Each attribute is a column name, and each value is the value for that column.

The output will look something like this:

df: col1   col2    Pattern       Topwear style       Bottomwear Length ....
    100    200     Printed       T shirt             Short

There are multiple rows with repeating and new attributes and values. How would I go about doing this in pandas? I tried searching for something similar but couldn't find anything useful.

piyush daga asked Dec 15 '25 18:12


2 Answers

Try with:

# Build a one-row frame per list (attributes become columns), stack them, then join on the index
df = df.join(pd.concat([pd.DataFrame(v).set_index('attribute').T
                        for v in df.pop('col3')]).reset_index(drop=True))

Setup:

import pandas as pd

d=[{'attribute': 'Pattern', 'value': 'Printed'},
 {'attribute': 'Topwear style', 'value': 'T shirt'},
 {'attribute': 'Bottomwear Length', 'value': 'Short'},
 {'attribute': 'Colour Palette', 'value': 'Bright colours'},
 {'attribute': 'Bottomwear style', 'value': 'Baggy'},
 {'attribute': 'Topwear length', 'value': 'Waist'},
 {'attribute': 'Sleeve style', 'value': 'Sleeveless'},
 {'attribute': 'Type of pattern', 'value': 'Graphic print'},
 {'attribute': 'Neck', 'value': 'Round'},
 {'attribute': 'Level of embellishment', 'value': 'No'}]
df = pd.DataFrame({'col1': 100, 'col2': 200, 'col3': [d]}, index=[0])

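One caveat with the one-liner: `reset_index(drop=True)` renumbers the new columns from 0, so the join only lines up when `df` has the default 0..n-1 index. Below is a minimal sketch of a variant that reuses the original index instead. The two-row frame, its index labels `10` and `20`, and the names `records`, `attrs` and `result` are made up for illustration; attributes missing from a row end up as NaN.

```python
import pandas as pd

# Hypothetical two-row frame with a non-default index and differing attributes.
df = pd.DataFrame(
    {
        "col1": [100, 101],
        "col2": [200, 201],
        "col3": [
            [{"attribute": "Pattern", "value": "Printed"},
             {"attribute": "Neck", "value": "Round"}],
            [{"attribute": "Pattern", "value": "Plain"},
             {"attribute": "Sleeve style", "value": "Sleeveless"}],
        ],
    },
    index=[10, 20],
)

# Collapse each row's list of dicts into a single {attribute: value} dict.
records = [{d["attribute"]: d["value"] for d in row} for row in df["col3"]]

# Reuse the original index so the join aligns row by row.
attrs = pd.DataFrame(records, index=df.index)
result = df.drop(columns="col3").join(attrs)
print(result)
```

This builds a single DataFrame from plain dicts rather than one small DataFrame per row, which may also be lighter on large inputs, though that has not been measured here.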

anky answered Dec 17 '25 08:12


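# Collect the dicts from col3 (one dict per row in the test frame below)
x = df['col3'].tolist()
# Each attribute becomes a column holding its single value
newcol = {item['attribute']: [item['value']] for item in x}
newdf = pd.DataFrame(newcol)
del df['col3']
# The right join keeps only the index of newdf, i.e. row 0
print(df.join(newdf, how='right'))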

Output

   col1  col2  Pattern Topwear style Bottomwear Length  Colour Palette  \
0   100   200  Printed       T shirt             Short  Bright colours  
... 

DataFrame used for the test:

import pandas as pd

data = {'col1':100, 'col2': 200, 'col3': [{'attribute': 'Pattern', 'value': 'Printed'},
 {'attribute': 'Topwear style', 'value': 'T shirt'},
 {'attribute': 'Bottomwear Length', 'value': 'Short'},
 {'attribute': 'Colour Palette', 'value': 'Bright colours'},
 {'attribute': 'Bottomwear style', 'value': 'Baggy'},
 {'attribute': 'Topwear length', 'value': 'Waist'},
 {'attribute': 'Sleeve style', 'value': 'Sleeveless'},
 {'attribute': 'Type of pattern', 'value': 'Graphic print'},
 {'attribute': 'Neck', 'value': 'Round'},
 {'attribute': 'Level of embellishment', 'value': 'No'}]}

df = pd.DataFrame(data)
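The test frame above holds one dict per row, while the frame in the question holds a whole list of dicts per cell. One possible bridge, sketched here rather than taken from the answer, is to `explode` the lists into one dict per row and then `pivot` back to wide form. The assumptions: pandas 1.1 or later (for a list passed as `index` to `pivot`), every list is non-empty, `(col1, col2)` identifies a row uniquely, and no attribute repeats within a row. The two-row frame and the names `long_df`, `pairs` and `wide` are made up for illustration.

```python
import pandas as pd

# Hypothetical frame shaped like the one in the question: a list of dicts per cell.
df = pd.DataFrame({
    "col1": [100, 101],
    "col2": [200, 201],
    "col3": [
        [{"attribute": "Pattern", "value": "Printed"},
         {"attribute": "Neck", "value": "Round"}],
        [{"attribute": "Pattern", "value": "Plain"}],
    ],
})

# One dict per row, with col1/col2 repeated for each dict.
long_df = df.explode("col3").reset_index(drop=True)

# Split the dicts into two ordinary columns: 'attribute' and 'value'.
pairs = pd.DataFrame(long_df.pop("col3").tolist())

# Pivot back to one row per (col1, col2), one column per attribute.
wide = (
    pd.concat([long_df, pairs], axis=1)
    .pivot(index=["col1", "col2"], columns="attribute", values="value")
    .reset_index()
)
print(wide)
```

`pivot` raises an error on duplicate `(col1, col2, attribute)` combinations, so if repeats are possible `pivot_table` with an explicit `aggfunc` may be a better fit.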
ComplicatedPhenomenon answered Dec 17 '25 08:12


