I'm working on the Chicago crimes dataset and I created a dataframe called primary which is just the type of crime. Then I grouped by the type of crime and got its count. This is the relevant code.
primary = crimes2012[['Primary Type']].copy()
test = primary.groupby('Primary Type').size().sort_values().reset_index(name='Count')
Now I have a dataframe 'test' which has the crimes and their counts. What I want to do is merge certain crimes together, for example "Non-Criminal", "Non - Criminal" and "Non-Criminal (Subject Specified)". But because they're rows now, I don't know how to do it. I was trying to use .loc[].
I also tried using
test['Primary Type'=='NON-CRIMINAL'] = test['Primary Type'=='NON - CRIMINAL']+test['Primary Type'=='NON-CRIMINAL']+test['Primary Type'=='NON-CRIMINAL (SUBJECT SPECIFIED)']
but of course that only returned a Boolean value of False.
You can use map or apply for this; see https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.map.html
You will have to create a dictionary that maps each input label to the desired output label:
desired_output = {"NON CRIMINAL": "NON-CRIMINAL", "NC": "NON-CRIMINAL", ...}
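and map it onto the 'Primary Type' column of primary as follows. Note that primary is a DataFrame, so the mapping goes on the column, and that map turns any label missing from the dictionary into NaN, so those are filled back in from the original column:
primary['Primary Type'] = primary['Primary Type'].map(desired_output).fillna(primary['Primary Type'])
Then group by as you are doing now.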
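To make this concrete, here is a minimal sketch of the whole flow. It assumes the three label spellings from the question. The small crimes2012 frame is a made-up stand-in for the real dataset. It also uses Series.replace instead of map, which is a swap on my part: replace leaves labels that are not in the dictionary untouched, so no fillna is needed. The second half shows one possible way to merge rows if all you have is the already-aggregated table.

```python
import pandas as pd

# Hypothetical stand-in for crimes2012; the real dataset has many more rows and columns.
crimes2012 = pd.DataFrame({
    "Primary Type": [
        "THEFT", "NON-CRIMINAL", "NON - CRIMINAL", "THEFT",
        "NON-CRIMINAL (SUBJECT SPECIFIED)", "BATTERY", "THEFT",
    ]
})

# Only the variant spellings need an entry; replace() leaves every other label as it is.
desired_output = {
    "NON - CRIMINAL": "NON-CRIMINAL",
    "NON-CRIMINAL (SUBJECT SPECIFIED)": "NON-CRIMINAL",
}

# Option 1: normalize the labels first, then count as before.
primary = crimes2012[["Primary Type"]].copy()
primary["Primary Type"] = primary["Primary Type"].replace(desired_output)
test = (
    primary.groupby("Primary Type")
    .size()
    .sort_values()
    .reset_index(name="Count")
)
print(test)

# Option 2: start from counts that were already aggregated on the raw labels,
# relabel the rows, and sum the counts that now share a label.
raw_counts = crimes2012.groupby("Primary Type").size().reset_index(name="Count")
raw_counts["Primary Type"] = raw_counts["Primary Type"].replace(desired_output)
merged = raw_counts.groupby("Primary Type", as_index=False)["Count"].sum()
print(merged)
```

Both options should end up with a single NON-CRIMINAL row whose count is the sum of the three variants. Option 1 is usually simpler if you still have the raw data around.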