Drop rows that do not contain a string within a value using pandas

Question

I want to use python pandas to drop rows in a spreadsheet that do not contain "CAT" within the "Equipment" column. Here is what I have so far:

import pandas as pd
import openpyxl
import warnings
import xlrd

warnings.filterwarnings("ignore")

file_name = input("Enter File Name: ")
df = pd.read_csv(file_name)


for i in range(len(df["Equipment"])):
    if "CAT" in df["Equipment"][i]:
        print (df["Equipment"][i])
    else:
        df.drop([i])

df.to_excel("new_mp.xlsx")

The script prints out the correct values in the terminal but does not drop the rows in the new spreadsheet "new_mp.xlsx". Can someone point me in the right direction?

Example Data:

Equipment
CAT 259B3 1818 OE Skid Steer 2011 CAT
T-14 Towmaster Trailer 3124 OE Trailer 2008
CAT 938M Loader RPO RENTAL 2017 CAT 938M
Rental Water Truck 55571 Rental Water Truck international water truck

cs95 · Accepted Answer

You don't need a loop here; you can do this with str.contains:

v = df[df["Equipment"].str.contains('CAT')]
print(v)
                                  Equipment
0     CAT 259B3 1818 OE Skid Steer 2011 CAT
2  CAT 938M Loader RPO RENTAL 2017 CAT 938M

Or, use query with engine='python':

v = df.query('Equipment.str.contains("CAT")', engine='python')
print(v)
                                  Equipment
0     CAT 259B3 1818 OE Skid Steer 2011 CAT
2  CAT 938M Loader RPO RENTAL 2017 CAT 938M

Finally, write to disk:

v.to_excel("new_mp.xlsx")

Omkar Sabade · Answer

You can use this to filter the rows.

df[df['Equipment'].apply(lambda x: 'CAT' in x)].to_excel("new_mp.xlsx")

Edit:

Alright, let me explain this inside out :

lambda x: 'CAT' in x takes as input x and returns True if 'CAT' is in x.

Now the df['Equipment'].apply method will invoke the lambda function mentioned above on every value of the series. The result will be a boolean array which is then passed on to the df as boolean mask.

Hope I didn't overdo it.

Drop rows that do not contain a string within a value using pandas

Tags:

python

string

pandas

cbennett423

2 Answers

cs95

Omkar Sabade

Recent Activity

Donate For Us

Drop rows that do not contain a string within a value using pandas

Tags:

python

string

pandas

cbennett423

2 Answers

cs95

Omkar Sabade

Related questions

Recent Activity

Donate For Us