
Pandas assign with str columns

I really love the pandas DataFrame.assign() method, especially in combination with lambda expressions. However, I ran into unexpected behavior with string concatenation that I don't understand yet. I have found this thread, but it does not answer my question: String concatenation of two pandas columns

Minimal working example of my problem:

import pandas as pd
df = pd.DataFrame({'Firstname': ['Sandy', 'Peter', 'Dolly'],
                   'Surname': ['Sunshine', 'Parker', 'Dumb']})

which looks like this:

  Firstname   Surname
0     Sandy  Sunshine
1     Peter    Parker
2     Dolly      Dumb

Now, if I'd like to assign e.g. Full Name I thought I could simply do:

df = df.assign(**{'Full Name': lambda x: f'{x.Firstname} {x.Surname}'})

but this does not create a new string like "Sandy Sunshine" for each row (as I expected). Instead, every row gets one string built from all rows, like this:

[Image: weird_pandas_assign_behavior]

Could anyone explain why my approach does not work and why this

df = df.assign(**{'Full Name': lambda x: x.Firstname + ' ' + x.Surname})

obviously works? Thank you :)

stack4science asked Nov 04 '25 16:11


1 Answer

df.assign(**{'Full Name': lambda x: f'{x.Firstname} {x.Surname}'})

That's where it goes wrong.

An f-string evaluates whatever is inside {} and inserts its string representation into the result. For a Series, that is the printed form of the whole column. Example:

print(f"Hello {df.Firstname} world")
Hello 0    Sandy
1    Peter
2    Dolly
Name: Firstname, dtype: object world

So f'{x.Firstname} {x.Surname}' evaluates to one multi-line string:

0    Sandy
1    Peter
2    Dolly
Name: Firstname, dtype: object 0    Sunshine
1      Parker
2        Dumb
Name: Surname, dtype: object

Now, assign broadcasts a scalar value to every row, so df.assign(new_col='a') would output:

  Firstname   Surname new_col
0     Sandy  Sunshine       a
1     Peter    Parker       a
2     Dolly      Dumb       a

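That is why you got the string below in every row: the lambda returns a single str rather than a Series, and assign broadcasts it to all rows, just like 'a' above.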

0    Sandy
1    Peter
2    Dolly
Name: Firstname, dtype: object 0    Sunshine
1      Parker
2        Dumb
Name: Surname, dtype: object
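
To check this yourself, here is a minimal sketch using the DataFrame from the question. The variable names formatted and out are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Firstname': ['Sandy', 'Peter', 'Dolly'],
                   'Surname': ['Sunshine', 'Parker', 'Dumb']})

# The f-string formats each whole Series at once, so the result is one str.
formatted = f'{df.Firstname} {df.Surname}'
print(type(formatted).__name__)

# assign() receives that single str from the lambda and broadcasts it.
out = df.assign(**{'Full Name': lambda x: f'{x.Firstname} {x.Surname}'})

# Every row holds the same value, so there is exactly one unique entry.
print(out['Full Name'].nunique())
```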

In the second case:

df.assign(**{'Full Name': lambda x: x.Firstname + ' ' + x.Surname})

This is equivalent to

df.assign(Full_name=df['Firstname'] + ' ' + df['Surname'])

It's just element-wise string concatenation: + on two Series returns a new Series with one value per row, so it works as intended.
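
As a quick sketch of the difference, the + expression can be inspected on its own, again with the question's data. The name combined is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Firstname': ['Sandy', 'Peter', 'Dolly'],
                   'Surname': ['Sunshine', 'Parker', 'Dumb']})

# + between Series (and a scalar str) is applied row by row.
combined = df['Firstname'] + ' ' + df['Surname']

print(type(combined).__name__)  # Series
print(combined.tolist())        # ['Sandy Sunshine', 'Peter Parker', 'Dolly Dumb']
```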

You can also use pd.Series.str.cat here:

df['Full Name'] = df['Firstname'].str.cat(df['Surname'], sep=' ')
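
If you want to stay with assign and a lambda, str.cat works there too. And if you prefer f-string syntax, one possible approach (not the only one) is to format row by row with zip, so that each f-string only ever sees two plain strings. The names out_cat and out_fstr below are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Firstname': ['Sandy', 'Peter', 'Dolly'],
                   'Surname': ['Sunshine', 'Parker', 'Dumb']})

# str.cat inside assign: the lambda returns a Series, one value per row.
out_cat = df.assign(
    **{'Full Name': lambda x: x['Firstname'].str.cat(x['Surname'], sep=' ')})

# Row-wise f-strings: the lambda returns a list with one str per row.
out_fstr = df.assign(
    **{'Full Name': lambda x: [f'{first} {last}'
                               for first, last in zip(x['Firstname'], x['Surname'])]})

print(out_cat['Full Name'].tolist())
print(out_fstr['Full Name'].tolist())
```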
Ch3steR answered Nov 06 '25 07:11


