Python: How can I extend a DataFrame with multiple fields that are calculated from a column

Question

I have a DataFrame which looks like:

     A           B
0  2.0  'C=4;D=5;'
1  2.0  'C=4;D=5;'
2  2.0  'C=4;D=5;'

I can parse the string in column B, let's say using a function named parse_col(), into a dict that looks like:

{'C': 4, 'D': 5}

How can I add the two extra columns to the DataFrame so that it looks like this:

     A           B  C  D
0  2.0  'C=4;D=5;'  4  5
1  2.0  'C=4;D=5;'  4  5
2  2.0  'C=4;D=5;'  4  5

I could extract just that column, parse it, and add the results back, but that is clearly not the best way.
I also tried a variation of the example in the pandas apply documentation, but I couldn't make it work on only a specific column.

Erfan · Accepted Answer

We can use Series.str.extractall and then chain it with unstack to pivot the rows to columns:

df[['C', 'D']] = df['B'].str.extractall(r'(\d+)').unstack()

     A           B  C  D
0  2.0  'C=4;D=5;'  4  5
1  2.0  'C=4;D=5;'  4  5
2  2.0  'C=4;D=5;'  4  5
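
The one-liner above hardcodes the column names and leaves the extracted values as strings. As a hedged variation (not part of the original answer), the sketch below captures both the key and the value, so the new column names come from the data, and then converts the values to integers. The group names key and value and the variables pairs, wide and result are illustrative choices, and the sample frame is rebuilt from the question.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({"A": [2.0, 2.0, 2.0], "B": ["C=4;D=5;"] * 3})

# Capture every key and its numeric value as two named groups.
# The result has one row per match, indexed by (original row, match number).
pairs = df["B"].str.extractall(r"(?P<key>\w+)=(?P<value>\d+)")

# Drop the match number, index by (original row, key), then pivot keys to columns.
wide = (
    pairs.reset_index(level="match", drop=True)
         .set_index("key", append=True)["value"]
         .unstack("key")
         .astype(int)  # assumes every row contains every key (no missing values)
)

# Align on the original row index and attach the new columns.
result = df.join(wide)
print(result)
```

If some rows lack a key, the pivot produces missing values and the integer conversion would fail, so that step would need adjusting for such data.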

Sayandip Dutta · Answer

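You can use df.eval with functools.reduce, so that the column names are read directly from the strings. Note that .xs(0) takes the assignments from the first row only and applies them to every row, so this assumes all rows hold the same values:

>>> from functools import reduce
>>> reduce(
...     lambda x, y: x.eval(y),
...     df.B.str.extractall(r'([A-Za-z]=\d+)').unstack().xs(0),
...     df,
... )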

     A           B  C  D
0  2.0  'C=4;D=5;'  4  5
1  2.0  'C=4;D=5;'  4  5
2  2.0  'C=4;D=5;'  4  5
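
For completeness, here is a minimal sketch of the route the question hints at: parsing each row with a function and expanding the resulting dicts into columns. The body of parse_col below is a hypothetical implementation, since the question only names the function, and it assumes every value is an integer. The sample frame is rebuilt from the question.

```python
import pandas as pd


def parse_col(text):
    """Hypothetical parser: turn 'C=4;D=5;' into {'C': 4, 'D': 5}."""
    # Strip optional surrounding quotes, split on ';', and skip the empty tail.
    items = (item.split("=", 1) for item in text.strip("'").split(";") if item)
    return {key: int(value) for key, value in items}


# Sample data rebuilt from the question.
df = pd.DataFrame({"A": [2.0, 2.0, 2.0], "B": ["C=4;D=5;"] * 3})

# Parse each row independently, build a frame from the list of dicts,
# and reuse the original index so the join lines up row by row.
parsed = pd.DataFrame(df["B"].map(parse_col).tolist(), index=df.index)
result = df.join(parsed)
print(result)
```

Because each row is parsed on its own, this version does not assume that all rows contain the same keys or values; keys missing from a row simply show up as NaN in that row.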

Tags: python, pandas
