
Splitting strings inside a pandas dataframe

I have a dataframe in which some rows hold list-like, comma-separated strings:

import pandas as pd

df=pd.DataFrame({'Name':['Stooge, Nick','Dick, Tracy','Rick, Nike','Maw','El','Paw, Maw, Haw','Caw', 'Greep'],
'key':[2,2,2,1,1,3,1,1,],
'Lastname':['Smith, Foo','Johnson, Macy','Johnson, Sike','Simpson','Diablo','Simpson, Sampson, Simmons','Simpson', 'Mortimer']
})


# .ix was removed in pandas 1.0; .loc does the same label-based selection here
df.loc[df['key'] == 2, 'Full'] = df['Name'] + ', ' + df['Lastname']
df.loc[df['key'] == 1, 'Full'] = df['Name'] + ' ' + df['Lastname']
print(df)

Output:

                    Lastname           Name  key                        Full
0                 Smith, Foo   Stooge, Nick    2    Stooge, Nick, Smith, Foo
1              Johnson, Macy    Dick, Tracy    2  Dick, Tracy, Johnson, Macy
2              Johnson, Sike     Rick, Nike    2   Rick, Nike, Johnson, Sike
3                    Simpson            Maw    1                 Maw Simpson
4                     Diablo             El    1                   El Diablo
5  Simpson, Sampson, Simmons  Paw, Maw, Haw    3                         NaN
6                    Simpson            Caw    1                 Caw Simpson
7                   Mortimer          Greep    1              Greep Mortimer

Is there a way to manipulate or split the strings inside the dataframe on the comma so that it produces results like:

                    Lastname           Name  key                        Full
0                 Smith, Foo   Stooge, Nick    2    Stooge Smith and Nick Foo
1              Johnson, Macy    Dick, Tracy    2  Dick Johnson and Tracy Macy
2              Johnson, Sike     Rick, Nike    2   Rick Johnson and Nike Sike
3                    Simpson            Maw    1                 Maw Simpson
4                     Diablo             El    1                   El Diablo
5  Simpson, Sampson, Simmons  Paw, Maw, Haw    3                         NaN
6                    Simpson            Caw    1                 Caw Simpson
7                   Mortimer          Greep    1              Greep Mortimer
asked Dec 05 '25 21:12 by ccsv


1 Answer

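# Split each column on commas and stack the pieces into one long Series,
# indexed by (original row, position within that row).
ln = df.Lastname.str.split(r',\s*', expand=True).stack()
fn = df.Name.str.split(r',\s*', expand=True).stack()
# Pair first and last names by position, then join each row's pairs with ' and '.
df['full'] = fn.add(' ').add(ln).groupby(level=0).apply(tuple).str.join(' and ')
print(df)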
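
For comparison, here is a row-wise sketch of the same pairing idea. It is not the answerer's method, just an alternative under a few assumptions: the helper name `pair_names` is hypothetical, the result goes into `Full` (as in the question) rather than `full`, and rows whose `key` is neither 1 nor 2 are left as NaN to match the desired output in the question. The stacked approach above appears to pair every row, including the three-name one.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['Stooge, Nick', 'Dick, Tracy', 'Rick, Nike', 'Maw', 'El',
             'Paw, Maw, Haw', 'Caw', 'Greep'],
    'key': [2, 2, 2, 1, 1, 3, 1, 1],
    'Lastname': ['Smith, Foo', 'Johnson, Macy', 'Johnson, Sike', 'Simpson',
                 'Diablo', 'Simpson, Sampson, Simmons', 'Simpson', 'Mortimer'],
})


def pair_names(row):
    """Hypothetical helper: pair the i-th first name with the i-th last name."""
    # Assumption: only keys 1 and 2 are handled; anything else stays NaN.
    if row['key'] not in (1, 2):
        return np.nan
    first = [part.strip() for part in row['Name'].split(',')]
    last = [part.strip() for part in row['Lastname'].split(',')]
    # zip pairs the names by position; a single name yields no ' and '.
    return ' and '.join(f'{f} {l}' for f, l in zip(first, last))


df['Full'] = df.apply(pair_names, axis=1)
print(df[['Name', 'Lastname', 'key', 'Full']])
```

A row-wise `apply` is usually slower than the vectorised string methods on large frames, but it keeps the per-row rule (such as skipping `key == 3`) in one readable place.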


answered Dec 08 '25 10:12 by piRSquared


