Find strings with UPPER case letters and ends with a certain word in regex

Question

I have a dataframe where one column consists of strings that have three patterns:

1) Upper case letters only: APPLE COMPANY

2) Upper case letters and ends with the letters AS: CAR COMPANY AS

3) Upper and lower case letters: John Smith

import pandas as pd
df = pd.DataFrame({'NAME': ['APPLE COMPANY', 'CAR COMPANY AS', 'John Smith']})

             NAME ...
0   APPLE COMPANY ...
1  CAR COMPANY AS ...
2      John Smith ...
3             ... ...

How can I remove the rows that match only 1), keeping those that meet conditions 2) or 3)? In other words, how can I drop rows that consist only of UPPER case letters and do not end with AS, while keeping rows that end with AS or contain both UPPER and LOWER case letters?

I came up with this:

df['NAME'].str.findall(r"(^[A-Z ':]+$)")
df['NAME'].str.findall('AS')

The first one extracts strings with only upper case letters, but the second one only finds AS. If there are methods other than regex, I am happy to try those as well.

Expected outcome is:

             NAME ...
1  CAR COMPANY AS ...
2      John Smith ...
3             ... ...

Sweeper · Accepted Answer

This regex should work:

^(?:[A-Z ':]+ AS|.*[a-z].*)$

It matches either one of these:

[A-Z ':]+ AS - A string of only uppercase letters (plus spaces, ' and :) that ends with the word AS
.*[a-z].* - Any string that contains at least one lowercase letter
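
To apply the pattern above to the DataFrame from the question, one option is a boolean mask built with Series.str.contains. This is a minimal sketch rather than part of the original answer; the names pattern, mask and result are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'NAME': ['APPLE COMPANY', 'CAR COMPANY AS', 'John Smith']})

# The regex from the answer; ^ and $ anchor it to the whole string.
pattern = r"^(?:[A-Z ':]+ AS|.*[a-z].*)$"

# True for rows that end with " AS" or contain a lowercase letter.
mask = df['NAME'].str.contains(pattern, regex=True)

# Keep only the matching rows.
result = df[mask]
print(result)
```

Because the group is non-capturing, str.contains treats it as a plain match and does not warn about capture groups.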


Mohamed Thasin ah · Answer

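One way would be:

# Capture the name only if it consists solely of upper case letters, spaces, ' or :
df['temp'] = df['NAME'].str.extract(r"(^[A-Z ':]+$)", expand=False)
s1 = df['temp'] == df['NAME']          # True for all-upper-case names
s2 = ~df['NAME'].str.endswith('AS')    # True for names that do not end with AS

# Drop rows that are all upper case and do not end with AS
print(df.loc[~(s1 & s2), 'NAME'])

Output: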

1    CAR COMPANY AS
2        John Smith
Name: NAME, dtype: object
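
Since the question also asks about methods other than regex, the same filter can be sketched with the pandas string methods str.isupper and str.endswith. This is an illustrative alternative, not taken from either answer. It assumes the suffix is the separate word AS (hence the leading space in ' AS') and that every name contains at least one letter.

```python
import pandas as pd

df = pd.DataFrame({'NAME': ['APPLE COMPANY', 'CAR COMPANY AS', 'John Smith']})

# True when every cased character in the name is upper case.
all_upper = df['NAME'].str.isupper()

# True when the name ends with the separate word "AS" (assumption: leading space).
ends_with_as = df['NAME'].str.endswith(' AS')

# Keep rows that have some lowercase letter, or that end with " AS".
kept = df[~all_upper | ends_with_as]
print(kept['NAME'])
```

Unlike the regex character class, str.isupper ignores digits and punctuation, so the two approaches can differ on names that contain characters other than letters, spaces, ' and :.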
