I would like to delete all text after the 2nd comma (counting from the left) in dataframe strings that contain "County, Texas". For example,
Before:
After:
Thank you for your help!
Use mask with str.contains() to restrict the operation to rows that match the condition, then build the replacement value with .str.split(', ').str[0:2].agg(', '.join):
df['Col'] = df['Col'].mask(df['Col'].str.contains('County, Texas'),
                           df['Col'].str.split(', ').str[0:2].agg(', '.join))
Full Code:
import pandas as pd
df = pd.DataFrame({'Col': {0: 'Jack Smith, Bank, Wilber, Lincoln County, Texas',
                           1: 'Jack Smith, Union, Credit, Bank, Wilber, Lincoln County, Texas',
                           2: 'Jack Smith, Union, Credit, Bank, Wilber, Lincoln County, Texas, Branch, Landing, Services',
                           3: 'Jack Smith, Union, Credit, Bank, Wilber, Branch, Landing, Services'}})
df['Col'] = df['Col'].mask(df['Col'].str.contains('County, Texas'),
                           df['Col'].str.split(', ').str[0:2].agg(', '.join))
df
Out[1]:
Col
0 Jack Smith, Bank
1 Jack Smith, Union
2 Jack Smith, Union
3 Jack Smith, Union, Credit, Bank, Wilber, Branc...
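To make the one-liner above easier to follow, here is a minimal sketch that splits it into three named steps. The sample Series and the variable names (has_county, first_two) are made up for illustration, and it uses .str.join(', ') in place of .agg(', '.join), which should give the same result on a column of lists:

```python
import pandas as pd

# Hypothetical sample data, modelled on the strings used in the answer.
s = pd.Series([
    'Jack Smith, Bank, Wilber, Lincoln County, Texas',
    'Jack Smith, Union, Credit, Bank, Wilber, Branch',
])

# Step 1: boolean condition, True where the row mentions 'County, Texas'.
# regex=False treats the pattern as a literal substring.
has_county = s.str.contains('County, Texas', regex=False)

# Step 2: split on ', ', keep the first two pieces, and join them again.
first_two = s.str.split(', ').str[:2].str.join(', ')

# Step 3: mask replaces values where the condition is True and keeps the rest.
result = s.mask(has_county, first_two)
print(result.tolist())
```

Only the first row is shortened; the second row has no 'County, Texas' and is left untouched.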
Per the updated question, you can use np.select:
import numpy as np
import pandas as pd
df = pd.DataFrame({'Col': {0: 'Jack Smith, Bank, Wilber, Lincoln County, Texas',
                           1: 'Jack Smith, Bank, Credit, Bank, Wilber, Lincoln County, Texas',
                           2: 'Jack Smith, Bank, Union, Credit, Bank, Wilber, Lincoln County, Texas, Branch, Landing, Services',
                           3: 'Jack Smith, Bank, Credit, Bank, Wilber, Branch, Landing, Services'}})
df['Col'] = np.select([df['Col'].str.contains('County, Texas') & ~df['Col'].str.contains('Union'),
                       df['Col'].str.contains('County, Texas') & df['Col'].str.contains('Union')],
                      [df['Col'].str.split(', ').str[0:2].agg(', '.join),
                       df['Col'].str.split(', ').str[0:3].agg(', '.join)],
                      df['Col'])
df
Out[2]:
Col
0 Jack Smith, Bank
1 Jack Smith, Bank
2 Jack Smith, Bank, Union
3 Jack Smith, Bank, Credit, Bank, Wilber, Branch...
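For readers new to np.select: it takes a list of conditions, a matching list of choices, and a default that is used where no condition holds. The sketch below shows the same idea on a smaller, made-up Series, with the conditions named so the pairing is easier to see. The variable names are hypothetical, and .str.join(', ') is swapped in for .agg(', '.join):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, modelled on the strings used in the answer.
s = pd.Series([
    'Jack Smith, Bank, Wilber, Lincoln County, Texas',
    'Jack Smith, Bank, Union, Credit, Lincoln County, Texas',
    'Jack Smith, Bank, Credit, Branch',
])

in_texas = s.str.contains('County, Texas', regex=False)
has_union = s.str.contains('Union', regex=False)
parts = s.str.split(', ')

# conditions[i] is paired with choices[i]; rows matching no condition
# fall back to the default (the original string).
conditions = [in_texas & ~has_union, in_texas & has_union]
choices = [parts.str[:2].str.join(', '), parts.str[:3].str.join(', ')]

result = np.select(conditions, choices, default=s)
print(list(result))
```

Note that np.select returns a NumPy array rather than a Series, so it is usually assigned straight back to a dataframe column as in the answer.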
You can simply use a combination of map with a lambda, split and join:
df['Example'] = df['Example'].map(lambda x: ','.join(x.split(',')[0:2]) if 'County, Texas' in x else x)
In this case:
import pandas as pd
df = pd.DataFrame({'Example': ["Jack Smith, Bank, Wilber, Lincoln County, Texas",
                               "Jack Smith, Union, Credit, Bank, Wilber, Lincoln County, Texas",
                               "Jack Smith, Union, Credit, Bank, Wilber, Lincoln County, Texas, Branch, Landing, Services",
                               "Jack Smith, Union, Credit, Bank, Wilber, Branch, Landing, Services"]})
df['Example'] = df['Example'].map(lambda x: ','.join(x.split(',')[0:2]) if 'County, Texas' in x else x)
We get the following output:
Example
0 Jack Smith, Bank
1 Jack Smith, Union
2 Jack Smith, Union
3 Jack Smith, Union, Credit, Bank, Wilber, Branc...
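If the lambda gets hard to read, one option is to pull it out into a small named function. This is only a sketch: keep_first_fields and its n and marker parameters are hypothetical names, and it assumes every value in the column is a string and that fields are separated by ', ':

```python
import pandas as pd

def keep_first_fields(text, n=2, marker='County, Texas'):
    """Keep the first n comma-separated fields if marker occurs in text."""
    if marker not in text:
        return text  # rows without the marker are returned unchanged
    return ', '.join(text.split(', ')[:n])

# Hypothetical sample data, modelled on the strings used in the answer.
df = pd.DataFrame({'Example': [
    'Jack Smith, Bank, Wilber, Lincoln County, Texas',
    'Jack Smith, Union, Credit, Bank, Wilber, Branch, Landing, Services',
]})
df['Example'] = df['Example'].map(keep_first_fields)
print(df['Example'].tolist())
```

To keep a different number of fields, wrap the call, for example df['Example'].map(lambda x: keep_first_fields(x, n=3)).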