Convert a column with a list of dicts, where column name and value are both present as values inside dict keys

Question

This question is different from the other ones because in none of them the column name resides in the value of a key... Please look at the examples given before marking as duplicate.

I have a df like so:

df: col1 col2 col3
    100  200  [{'attribute': 'Pattern', 'value': 'Printed'},...

Closer look at column 3 looks like:

[{'attribute': 'Pattern', 'value': 'Printed'},
 {'attribute': 'Topwear style', 'value': 'T shirt'},
 {'attribute': 'Bottomwear Length', 'value': 'Short'},
 {'attribute': 'Colour Palette', 'value': 'Bright colours'},
 {'attribute': 'Bottomwear style', 'value': 'Baggy'},
 {'attribute': 'Topwear length', 'value': 'Waist'},
 {'attribute': 'Sleeve style', 'value': 'Sleeveless'},
 {'attribute': 'Type of pattern', 'value': 'Graphic print'},
 {'attribute': 'Neck', 'value': 'Round'},
 {'attribute': 'Level of embellishment', 'value': 'No'}]

Where each attribute is column name and each value, is the value for that column name.

The output will look something like this:

df: col1   col2    Pattern       Topwear Style       Bottomwear Length ....
    100    200     Printed       T shirt             Shorts

There are multiple rows with repeating and new attributes and values. How would I go about doing this in pandas? I tried searching for something similar but couldn't find anything useful.

anky · Accepted Answer

Try with:

df=df.join(pd.concat([pd.DataFrame(v).set_index('attribute').T 
               for v in df.pop('col3')]).reset_index(drop=True))

Setup:

d=[{'attribute': 'Pattern', 'value': 'Printed'},
 {'attribute': 'Topwear style', 'value': 'T shirt'},
 {'attribute': 'Bottomwear Length', 'value': 'Short'},
 {'attribute': 'Colour Palette', 'value': 'Bright colours'},
 {'attribute': 'Bottomwear style', 'value': 'Baggy'},
 {'attribute': 'Topwear length', 'value': 'Waist'},
 {'attribute': 'Sleeve style', 'value': 'Sleeveless'},
 {'attribute': 'Type of pattern', 'value': 'Graphic print'},
 {'attribute': 'Neck', 'value': 'Round'},
 {'attribute': 'Level of embellishment', 'value': 'No'}]
df=pd.DataFrame({'a':100,'b':200,'col3':[d]},index=[0])

Output:

enter image description here

ComplicatedPhenomenon · Answer

x = df['col3'].tolist()
newcol = {item['attribute'] : [item['value']] for item in x }
newdf = pd.DataFrame(newcol)
del df['col3'] 
print(df.join(newdf, how='right'))

Output

   col1  col2  Pattern Topwear style Bottomwear Length  Colour Palette  \
0   100   200  Printed       T shirt             Short  Bright colours  
...

dataframe for test.

data = {'col1':100, 'col2': 200, 'col3': [{'attribute': 'Pattern', 'value': 'Printed'},
 {'attribute': 'Topwear style', 'value': 'T shirt'},
 {'attribute': 'Bottomwear Length', 'value': 'Short'},
 {'attribute': 'Colour Palette', 'value': 'Bright colours'},
 {'attribute': 'Bottomwear style', 'value': 'Baggy'},
 {'attribute': 'Topwear length', 'value': 'Waist'},
 {'attribute': 'Sleeve style', 'value': 'Sleeveless'},
 {'attribute': 'Type of pattern', 'value': 'Graphic print'},
 {'attribute': 'Neck', 'value': 'Round'},
 {'attribute': 'Level of embellishment', 'value': 'No'}]}

df = pd.DataFrame(data)

Convert a column with a list of dicts, where column name and value are both present as values inside dict keys

Tags:

python

dictionary

list

pandas

piyush daga

2 Answers

anky

ComplicatedPhenomenon

Recent Activity

Donate For Us

Convert a column with a list of dicts, where column name and value are both present as values inside dict keys

Tags:

python

dictionary

list

pandas

piyush daga

2 Answers

anky

ComplicatedPhenomenon

Related questions

Recent Activity

Donate For Us