
Filtering a field with multiple values pandas python

Quite a basic question, apologies if it's been asked before, but I couldn't find the answer.

I'm trying to filter a dataset by gender so that I can see the male/female sales split, but the data only records a title, i.e. Mr, Mrs, Miss or Ms.

I have for men:

men = cd.loc[cd.title_desc == "MR", "SALES"]

For women I want MRS, MISS & MS included, i.e.

women = cd.loc[cd.title_desc == "MRS" and "MISS" and "MS", "SALES"]

but obviously the "and" isn't correct.

Help appreciated!

pow asked Sep 02 '25 05:09


2 Answers

This has definitely been asked before, but here you go.

To create two different Series objects by filtering on multiple values:

men = cd.loc[cd.title_desc == 'MR','SALES']
women = cd.loc[cd.title_desc.isin(['MRS','MISS','MS']), 'SALES']

Alternatively, if you want to go straight to total sales by gender:

cd['gender'] = ''
cd.loc[cd.title_desc == 'MR', 'gender'] = 'men'
cd.loc[cd.title_desc.isin(['MRS','MISS','MS']), 'gender'] = 'women'
cd.groupby('gender').agg({'SALES': 'sum'})
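
For anyone who wants to try this end to end, here is a minimal sketch on made-up data. Only the names cd, title_desc and SALES come from the question; the sample rows, the title_to_gender dict and the "DR" title are assumptions for illustration. Instead of the two .loc assignments it maps titles to a label with Series.map, which is one possible alternative, not the only way to do it.

```python
import pandas as pd

# Made-up sample data; only the column names come from the question.
cd = pd.DataFrame({
    "title_desc": ["MR", "MRS", "MISS", "MS", "MR", "DR"],
    "SALES": [100, 200, 50, 25, 10, 5],
})

# Two separate Series, as in the first snippet of this answer.
men = cd.loc[cd.title_desc == "MR", "SALES"]
women = cd.loc[cd.title_desc.isin(["MRS", "MISS", "MS"]), "SALES"]

# Hypothetical lookup table from title to gender label.
# Titles missing from the dict (here "DR") map to NaN,
# and groupby drops NaN keys by default.
title_to_gender = {"MR": "men", "MRS": "women", "MISS": "women", "MS": "women"}
sales_by_gender = cd.groupby(cd.title_desc.map(title_to_gender))["SALES"].sum()

print(sales_by_gender)
```

A dict keeps the title-to-gender rules in one place, so adding another title later is a one-line change.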
jack6e answered Sep 04 '25 18:09


You can break it up into separate comparisons and combine them with the element-wise or operator '|'. Each comparison needs its own parentheses because '|' binds more tightly than '=='. The resulting boolean vector can then be used with .loc:

bvec = (cd.title_desc == "MRS") | (cd.title_desc == "MISS") | (cd.title_desc == "MS")
women = cd.loc[bvec,"SALES"]
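
As a rough illustration of why the original "and" version fails while '|' works, here is a small sketch on made-up data (the sample rows are an assumption, as are the helper names raised and same_mask). Python's "and" asks for a single True/False, but the comparison produces a whole boolean Series, so pandas refuses to guess and raises ValueError.

```python
import pandas as pd

# Made-up sample data; only the column names come from the question.
cd = pd.DataFrame({
    "title_desc": ["MR", "MRS", "MISS", "MS", "MR"],
    "SALES": [100, 200, 50, 25, 10],
})

# The attempt from the question: `and` needs one truth value, but
# cd.title_desc == "MRS" is a Series, so this raises ValueError.
try:
    cd.loc[cd.title_desc == "MRS" and "MISS" and "MS", "SALES"]
    raised = False
except ValueError:
    raised = True

# Element-wise OR; each comparison is parenthesised because | binds
# more tightly than ==.
bvec = (cd.title_desc == "MRS") | (cd.title_desc == "MISS") | (cd.title_desc == "MS")
women = cd.loc[bvec, "SALES"]

# The chained comparisons build the same mask as isin().
same_mask = bvec.equals(cd.title_desc.isin(["MRS", "MISS", "MS"]))
```

With more than two or three values the chained form gets long, which is where isin tends to read better.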
grg rsr answered Sep 04 '25 19:09