I have a dataframe which looks like:
A B
0 2.0 'C=4;D=5;'
1 2.0 'C=4;D=5;'
2 2.0 'C=4;D=5;'
I can parse the string in column B, let's say using a function named parse_col(), into a dict that looks like:
{C: 4, D: 5}
How can I add the two extra columns to the dataframe so that it looks like this:
A B C D
0 2.0 'C=4;D=5;' 4 5
1 2.0 'C=4;D=5;' 4 5
2 2.0 'C=4;D=5;' 4 5
I can take just that column, parse it, and add the results back, but that's clearly not the best way.
I also tried a variation of the example in the pandas apply documentation, but I couldn't make it work on only a specific column.
We can use Series.str.extractall and then chain it with unstack to pivot the rows to columns:
df[['C', 'D']] = df['B'].str.extractall(r'(\d+)').unstack()
A B C D
0 2.0 'C=4;D=5;' 4 5
1 2.0 'C=4;D=5;' 4 5
2 2.0 'C=4;D=5;' 4 5
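If you already have a parser that returns a dict, as the question describes, another option is to build a frame from the dicts and join it back. The sketch below is only an illustration: the body of parse_col() is a hypothetical stand-in for the asker's function (the question does not show it), and the sample frame is rebuilt from the question's data.

```python
import pandas as pd

def parse_col(text):
    # Hypothetical parser: the question only names parse_col(), so this
    # body is an assumption. It turns 'C=4;D=5;' into {'C': 4, 'D': 5}.
    items = (item.split("=", 1) for item in text.strip("'").split(";") if item)
    return {key: int(value) for key, value in items}

# Sample data rebuilt from the question.
df = pd.DataFrame({"A": [2.0, 2.0, 2.0], "B": ["C=4;D=5;"] * 3})

# Parse each string into a dict, turn the dicts into a frame that shares
# df's index, then join the new columns onto the original frame.
parsed = pd.DataFrame(df["B"].map(parse_col).tolist(), index=df.index)
result = df.join(parsed)
print(result)
```

The new column names come from the dict keys, so nothing has to be hard-coded on the left-hand side of an assignment.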
You can use df.eval and functools.reduce; this way the column names are read directly from the strings:
>>> from functools import reduce
>>> reduce(
...     lambda x, y: x.eval(y),
...     df.B.str
...         .extractall(r'([A-Za-z]=\d+)')
...         .unstack().xs(0), df
... )
A B C D
0 2.0 'C=4;D=5;' 4 5
1 2.0 'C=4;D=5;' 4 5
2 2.0 'C=4;D=5;' 4 5
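Note that .xs(0) takes the matches from the first row only, so the snippet above assigns the same values to every row. If the values can differ between rows, a variation with named capture groups may help. This is a sketch, not part of the original answer: the group names key and value are illustrative choices, and the third row of the sample data is changed on purpose to show per-row values.

```python
import pandas as pd

# Sample data based on the question; the last row is altered (an assumption
# for illustration) so that per-row differences are visible.
df = pd.DataFrame({
    "A": [2.0, 2.0, 3.0],
    "B": ["C=4;D=5;", "C=4;D=5;", "C=6;D=7;"],
})

# One output row per key=value pair, indexed by (original row, match number).
pairs = df["B"].str.extractall(r"(?P<key>[A-Za-z]+)=(?P<value>\d+)")

# Drop the match number, move the key into the index, and pivot the keys
# into columns so that each original row gets one value per key.
wide = (
    pairs.reset_index(level="match", drop=True)
    .set_index("key", append=True)["value"]
    .unstack("key")
    .astype(int)
)

result = df.join(wide)
print(result)
```

The astype(int) call assumes every row contains every key; if some keys can be missing, the pivot produces NaN and the cast would need to be dropped or replaced with a nullable type.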