How to convert a list of dictionaries to a Pandas Dataframe with one of the values as column name?

Question

I have a DataFrame in which one column holds a list of dictionaries in each row, something like this:

    column1    column2
0   abc        [{key1:value_A, key2:value_1}, {key1:value_B, key2:value_2}, {key1:value_C, key2:value_3},...]
    .
    .
    .
n   xyz        [{key1:value_A, key2:value_4}, {key1:value_B, key2:value_5}, {key1:value_C, key2:value_6},...]

I want to convert this DataFrame to something like this:

    column1    value_A    value_B    value_C ....
0   abc        value_1    value_2    value_3
    .
    .
    .
n   xyz        value_4    value_5    value_6

What is a fast and efficient way to do this?

You can use the following code snippet to generate the df:

import pandas as pd
df = pd.DataFrame([[1, [
    {'id': 1144801690551941, 'value': 20},
    {'id': 8202109018383881, 'value': 26},
    {'id': 3025222222235562, 'value': 37},
    {'id': 5834245818862827, 'value': 35},
    {'id': 4689782481420271, 'value': 27},
    {'id': 7385168421196875, 'value': 56},
    ]], [2, [
    {'id': 1144801690551941, 'value': 25},
    {'id': 8202109018383881, 'value': 26},
    {'id': 3025222222235562, 'value': 38},
    {'id': 5834245818862827, 'value': 35},
    {'id': 4689782481420271, 'value': 21},
    {'id': 7385168421196875, 'value': 53},
    ]], [3, [
    {'id': 1144801690551941, 'value': 20},
    {'id': 8202109018383881, 'value': 29},
    {'id': 3025222222235562, 'value': 37},
    {'id': 5834245818862827, 'value': 32},
    {'id': 4689782481420271, 'value': 27},
    {'id': 7385168421196875, 'value': 50},
    ]]], columns=['column1', 'column2'])

This produces:

   column1  column2
0        1  [{'id': 1144801690551941, 'value': 20}, {'id':...
1        2  [{'id': 1144801690551941, 'value': 25}, {'id':...
2        3  [{'id': 1144801690551941, 'value': 20}, {'id':...

The output I expect:

    column1  1144801690551941  8202109018383881  3025222222235562 ...
0   1        20                26                37
1   2        25                26                38
2   3        20                29                37

Thanks!

Ben.T · Accepted Answer

Call tolist on column2 and build a new DataFrame from the result, then stack it so that each row holds a single dictionary {'id': ..., 'value': ...}.

s = pd.DataFrame(df['column2'].tolist()).stack()
print(s)
# 0  0    {'id': 1144801690551941, 'value': 20}
#    1    {'id': 8202109018383881, 'value': 26}
#    2    {'id': 3025222222235562, 'value': 37}
#    3    {'id': 5834245818862827, 'value': 35}
#    4    {'id': 4689782481420271, 'value': 27}
#    5    {'id': 7385168421196875, 'value': 56}
# 1  0    {'id': 1144801690551941, 'value': 25}
#    1    {'id': 8202109018383881, 'value': 26}

Next, call tolist again on the Series s and build a DataFrame from it, passing the first level of s's index so that the original row labels are kept. Append the new id column to the index with set_index(..., append=True), select value, and unstack so that every id becomes a column header. That gives the id-value part in the wanted shape; all that remains is to join it to column1.

res = (
    df[['column1']]
      .join(pd.DataFrame(s.tolist(), 
                         s.index.get_level_values(0)) # keep original index
              .set_index('id', append=True)
              ['value'].unstack()
              .rename_axis(columns=None))
)
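
The chained expression above may be easier to follow when split into named steps. The following is only a sketch on a trimmed-down copy of the question's data (two rows, two ids); the intermediate names pairs, long_form and wide are made up for illustration and are not part of the original answer.

```python
import pandas as pd

# Trimmed-down copy of the question's data (assumption: two rows, two ids).
df = pd.DataFrame(
    [
        [1, [{"id": 1144801690551941, "value": 20},
             {"id": 8202109018383881, "value": 26}]],
        [2, [{"id": 1144801690551941, "value": 25},
             {"id": 8202109018383881, "value": 26}]],
    ],
    columns=["column1", "column2"],
)

# Same first step as in the answer: one dictionary per row, MultiIndex (row, position).
s = pd.DataFrame(df["column2"].tolist()).stack()

# Step 1: expand each dictionary into 'id' and 'value' columns,
# reusing the original row labels (level 0 of s's index).
pairs = pd.DataFrame(s.tolist(), index=s.index.get_level_values(0))

# Step 2: move 'id' into the index next to the row label and keep only 'value'.
long_form = pairs.set_index("id", append=True)["value"]

# Step 3: pivot the 'id' level into column headers and drop the 'id' axis name.
wide = long_form.unstack().rename_axis(columns=None)

# Step 4: attach the pivoted values to column1 by index.
res = df[["column1"]].join(wide)
print(res)
```

Printing pairs and long_form along the way shows how the row label is repeated once per dictionary before unstack spreads the ids across columns.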

With the question's df, this gives the expected result (note that unstack sorts the id columns):

print(res)
   column1  1144801690551941  3025222222235562  4689782481420271  \
0        1                20                37                27   
1        2                25                38                21   
2        3                20                37                27   

   5834245818862827  7385168421196875  8202109018383881  
0                35                56                26  
1                35                53                26  
2                32                50                29
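
As an alternative (not the method used in the answer above), each list can be turned into a single {id: value} dictionary with a plain comprehension, and a DataFrame can be built from those dictionaries. This is a minimal sketch on a shortened copy of the question's data; it assumes every id occurs at most once per row, since a repeated id would silently keep only the last value. The name wide is made up for illustration.

```python
import pandas as pd

# Shortened copy of the question's data (assumption: two rows, three ids).
df = pd.DataFrame(
    [
        [1, [{"id": 1144801690551941, "value": 20},
             {"id": 8202109018383881, "value": 26},
             {"id": 3025222222235562, "value": 37}]],
        [2, [{"id": 1144801690551941, "value": 25},
             {"id": 8202109018383881, "value": 26},
             {"id": 3025222222235562, "value": 38}]],
    ],
    columns=["column1", "column2"],
)

# Collapse each row's list into one dictionary mapping id -> value,
# then let the DataFrame constructor turn the ids into columns.
wide = pd.DataFrame(
    [{d["id"]: d["value"] for d in row} for row in df["column2"]],
    index=df.index,  # keep the original row labels so the join lines up
)

res = df[["column1"]].join(wide)
print(res)
```

Unlike the unstack approach, this keeps the id columns in the order in which they first appear, which matches the column order shown in the question. An id missing from a row would appear as NaN in that row.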

Tags: python, pandas

Pallav Doshi
