I am calling value_counts() on a column of integers that represent categorical values.
I have a dict that maps each number to the name of its category.
I want the index of the result to show the corresponding names, and I am looking for a cleaner way to do this than my current four-line solution.
import pandas as pd
df = pd.DataFrame({"weather": [1, 2, 1, 3]})
df
>>>
   weather
0        1
1        2
2        1
3        3
weather_correspondance_dict = {1:"sunny", 2:"rainy", 3:"cloudy"}
Here is how I currently solve the problem:
df_vc = df.weather.value_counts()
index = df_vc.index.map(lambda x: weather_correspondance_dict[x] )
df_vc.index = index
df_vc
>>>
sunny     2
cloudy    1
rainy     1
dtype: int64
I am not happy with this solution because it is tedious. Is there a best practice for this situation?
This is my solution:
>>> weather_correspondance_dict = {1:"sunny", 2:"rainy", 3:"cloudy"}
>>> df["weather"].value_counts().rename(index=weather_correspondance_dict)
    sunny     2
    cloudy    1
    rainy     1
    Name: weather, dtype: int64
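For anyone who wants to run the rename approach end to end, here is a minimal self-contained sketch. It reuses the question's data and dictionary; the `partial` example is my own addition, there only to show that labels missing from the dictionary are left unchanged rather than raising an error.

```python
import pandas as pd

# Same data and mapping as in the question
df = pd.DataFrame({"weather": [1, 2, 1, 3]})
weather_correspondance_dict = {1: "sunny", 2: "rainy", 3: "cloudy"}

# Count the integer codes, then relabel the index in one chained call
counts = df["weather"].value_counts()
named_counts = counts.rename(index=weather_correspondance_dict)
print(named_counts)

# Illustration only: codes absent from the dict keep their original label
partial = counts.rename(index={1: "sunny"})
print(partial)
```

rename returns a new Series, so `counts` itself keeps its integer index.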
Here's a simpler solution:
weathers = ['sunny', 'rainy', 'cloudy']
weathers_dict = dict(enumerate(weathers, 1))
df_vc = df['weather'].value_counts()
df_vc.index = df_vc.index.map(weathers_dict.get)
Explanation
Use dict with enumerate to construct a dictionary mapping integers to weather names.
Use dict.get with pd.Index.map. In older pandas versions, unlike pd.Series.map, pd.Index.map did not accept a dictionary directly, so you had to pass a callable instead. More recent versions also accept a dictionary.
Alternatively, you can apply the mapping to the weather column before calling pd.Series.value_counts. This way, you do not need to update the index of the result.
df['weather'] = df['weather'].map(weathers_dict)
df_vc = df['weather'].value_counts()
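One thing to watch with mapping before counting: codes that are missing from the dictionary become NaN, and value_counts drops NaN by default. The sketch below illustrates this under the assumption that the data contains a code 4 with no name, which is hypothetical and not part of the original question. It also stores the mapped values in a separate variable instead of overwriting the column.

```python
import pandas as pd

# Code 4 is a hypothetical unmapped value, added only for illustration
df = pd.DataFrame({"weather": [1, 2, 1, 3, 4]})
weathers_dict = dict(enumerate(["sunny", "rainy", "cloudy"], 1))

# Map codes to names; codes not in the dict become NaN
named = df["weather"].map(weathers_dict)

# Default behaviour: NaN rows are silently excluded from the counts
counts = named.value_counts()

# Pass dropna=False to keep a NaN entry for the unmapped codes
counts_with_missing = named.value_counts(dropna=False)

print(counts)
print(counts_with_missing)
```

If every code is guaranteed to be in the dictionary, both calls give the same result.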