Convert a pandas column containing dictionary into multiple rows

Tags:

python

pandas

I have this Dataframe

import pandas as pd
temp = pd.DataFrame({'Person': ['P1', 'P2'], 'Dictionary': [{'value1': 0.31, 'value2': 0.304}, {'value2': 0.324}]})

  Person                         Dictionary
0     P1  {'value1': 0.31, 'value2': 0.304}
1     P2                  {'value2': 0.324}

I want an output in this format:

temp1 = pd.DataFrame({'Person': ['P1', 'P1', 'P2'], 'Values_Number': ['value1', 'value2', 'value2'], 'Values': [0.31, 0.304, 0.324]})

  Person Values_Number  Values
0     P1        value1   0.310
1     P1        value2   0.304
2     P2        value2   0.324

I tried using this:

temp['Dictionary'].apply(pd.Series).T.reset_index()

But I am not able to concat this with the previous DataFrame. Also, this approach would be error-prone.

TayyabRahmani asked Sep 01 '25 22:09


1 Answer

IIUC, we could use Series.tolist to build a new DataFrame that we can then reshape with DataFrame.melt:

new_df = (pd.DataFrame(temp['Dictionary'].tolist(), index=temp['Person'])
            .reset_index()
            .melt('Person', var_name='Values_Number', value_name='Values')
            .dropna()
            .reset_index(drop=True))
print(new_df)

  Person Values_Number  Values
0     P1        value1   0.310
1     P1        value2   0.304
2     P2        value2   0.324
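
To see why the chain works, it can help to look at the intermediate frames. The following is a minimal sketch that splits the same pipeline into separate steps; the intermediate names `wide` and `long` are illustrative only and are not part of the answer above.

```python
import pandas as pd

temp = pd.DataFrame({
    'Person': ['P1', 'P2'],
    'Dictionary': [{'value1': 0.31, 'value2': 0.304}, {'value2': 0.324}],
})

# Step 1: one column per dictionary key; keys missing from a row become NaN.
wide = pd.DataFrame(temp['Dictionary'].tolist(), index=temp['Person'])

# Step 2: turn the key columns into rows, keeping 'Person' as the identifier.
long = wide.reset_index().melt('Person', var_name='Values_Number', value_name='Values')

# Step 3: drop the (Person, key) pairs that were absent from the original dictionaries.
new_df = long.dropna().reset_index(drop=True)
print(new_df)
```

Note that melt stacks the rows key by key (all of value1, then all of value2), so the rows come out grouped by person here only because the sample is small. With more people, sorting afterwards, for example with sort_values(['Person', 'Values_Number']), would probably be needed to get a per-person ordering.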

It is much more efficient to use pd.DataFrame(df['Dictionary'].tolist()) than .apply(pd.Series). You can see when you should use apply in your code here


This is the timing result for apply(pd.Series) versus pd.DataFrame(s.tolist()) obtained in this publication:

%timeit s.apply(pd.Series)
%timeit pd.DataFrame(s.tolist())

2.65 ms ± 294 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
816 µs ± 40.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
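
The Series `s` used in that benchmark is not shown here, so the following is only a rough sketch of how the comparison could be reproduced outside IPython with the standard timeit module. The contents and size of `s` are assumptions made for illustration, and the absolute timings will differ from the figures above depending on the machine and pandas version.

```python
import timeit

import pandas as pd

# Hypothetical test data: a Series of small dictionaries (not the original benchmark input).
s = pd.Series([{'a': float(i), 'b': float(i) * 2} for i in range(1000)])

# Both expressions expand the dictionaries into one column per key.
via_apply = s.apply(pd.Series)
via_tolist = pd.DataFrame(s.tolist())

# Time each approach over a few repetitions and report the average per call.
n = 5
t_apply = timeit.timeit(lambda: s.apply(pd.Series), number=n) / n
t_tolist = timeit.timeit(lambda: pd.DataFrame(s.tolist()), number=n) / n

print(f"apply(pd.Series):        {t_apply * 1e3:.2f} ms per call")
print(f"pd.DataFrame(s.tolist()): {t_tolist * 1e3:.2f} ms per call")
```

The gap tends to grow with the number of rows, because apply(pd.Series) constructs a separate Series object for every row, while the tolist route builds the frame in a single constructor call.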
ansev answered Sep 03 '25 20:09