Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

parse a dict from the csv file python

I am reading a data from the csv file like :

import pandas as pd
data_1=pd.read_csv("sample.csv")
data_1.head(10)

It has two columns :

ID   detail
1    [{'a': 1, 'b': 1.85, 'c': 'aaaa', 'd': 6}, {'a': 2, 'b': 3.89, 'c': 'bbbb', 'd': 10}]

the detail column is not a json but it is a dict and I want to flatten the dict and want the result something like this :

ID  a   b     c     d
1   1   1.85  aaaa  6
1   2   3.89  bbbb  10

I always get a,b,c,d in the detail column and want to move the final results to a sql table.

Can someone please help me as how to solve it.

like image 531
xhang Avatar asked Dec 11 '25 19:12

xhang


1 Answers

Use dictionary comprehension with ast.literal for convert strings repr to list of dicts and convert it to DataFrame, then use concat and convert first level of MultiIndex to ID column:

import ast

d = {i: pd.DataFrame(ast.literal_eval(d)) for i, d in df[['ID','detail']].to_numpy()}
#for oldier pandas version use .values
#d = {i: pd.DataFrame(ast.literal_eval(d)) for i, d in df[['ID','detail']].values)}
df = pd.concat(d).reset_index(level=1, drop=True).rename_axis('ID').reset_index()
print (df)
   ID  a     b     c   d
0   1  1  1.85  aaaa   6
1   1  2  3.89  bbbb  10

Or use lsit comprehension with DataFrame.assign for ID column, only necessary change order of columns - last column to first:

import ast

L = [pd.DataFrame(ast.literal_eval(d)).assign(ID=i) for i, d in df[['ID','detail']].to_numpy()]
#for oldier pandas versions use .values
#L = [pd.DataFrame(ast.literal_eval(d)).assign(ID=i) for i, d in df[['ID','detail']].values]
df = pd.concat(L, ignore_index=True)
df = df[df.columns[-1:].tolist() + df.columns[:-1].tolist()]
print (df)
   ID  a     b     c   d
0   1  1  1.85  aaaa   6
1   1  2  3.89  bbbb  10

EDIT:

For 2 IDs change second solution:

d = [pd.DataFrame(ast.literal_eval(d)).assign(ID1=i1, ID2=i2) for i1, i2, d in df[['ID1','ID2','detail']].to_numpy()]
df = pd.concat(d)
df = df[df.columns[-2:].tolist() + df.columns[:-2].tolist()]
like image 65
jezrael Avatar answered Dec 14 '25 10:12

jezrael



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!