I have data as follows:
ID Gender Country ...
1 Male UK
2 Female US
3 Male NZ
4 Female UK
...
There are only 2 options for gender and 3 for country. I would like to create a separate pie chart for each of "Gender" and "Country" to show how many times each option appears in the data, but I'm not sure how to do so.
The data is stored in a pandas dataframe.
Any and all help is much appreciated!
Here is an approach using pandas:
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
def label_function(val):
    # autopct passes each slice's percentage; convert it back to a row count
    return f'{val / 100 * len(df):.0f}\n{val:.0f}%'

# Random example data with N rows
N = 50
df = pd.DataFrame({'country': np.random.choice(['UK', 'US', 'NZ'], N),
                   'gender': np.random.choice(['Male', 'Female'], N)})
# One subplot per column
fig, (ax1, ax2) = plt.subplots(ncols=2, figsize=(10, 5))
df.groupby('country').size().plot(kind='pie', autopct=label_function, textprops={'fontsize': 20},
                                  colors=['tomato', 'gold', 'skyblue'], ax=ax1)
df.groupby('gender').size().plot(kind='pie', autopct=label_function, textprops={'fontsize': 20},
                                 colors=['violet', 'lime'], ax=ax2)
ax1.set_ylabel('Per country', size=22)
ax2.set_ylabel('Per gender', size=22)
plt.tight_layout()
plt.show()

PS: To just show the percentage, use autopct='%1.0f%%'.
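As a rough illustration of the percentage-only variant, here is a minimal sketch. The small DataFrame is made-up sample data shaped like the table in the question (an assumption, not the asker's real data), and the format string '%1.0f%%' replaces label_function.

```python
import pandas as pd
from matplotlib import pyplot as plt

# Hypothetical sample data, shaped like the table in the question
df = pd.DataFrame({'Gender': ['Male', 'Female', 'Male', 'Female'],
                   'Country': ['UK', 'US', 'NZ', 'UK']})

fig, (ax1, ax2) = plt.subplots(ncols=2, figsize=(10, 5))
# A format string passed as autopct prints only each slice's percentage
df.groupby('Gender').size().plot(kind='pie', autopct='%1.0f%%', ax=ax1)
df.groupby('Country').size().plot(kind='pie', autopct='%1.0f%%', ax=ax2)
ax1.set_ylabel('Per gender')
ax2.set_ylabel('Per country')
plt.tight_layout()
```

Call plt.show() afterwards to display the figure.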
I assume you started with something like this:
import pandas as pd
from matplotlib.pyplot import pie, axis, show
df = pd.DataFrame([[1,'Male','UK'], [2, 'Female', 'NZ'], [3, 'Male', 'UK'], [4, 'Male', 'US']], columns=['ID', 'Gender', 'Country'])
Plot for gender (column names are case-sensitive, so use 'Gender', not 'gender'):
df.groupby('Gender').size().plot(kind='pie', autopct='%.2f')
show()

Plot for country:
df.groupby('Country').size().plot(kind='pie', autopct='%.2f')
show()
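If you would rather have both pies side by side, one possible sketch uses value_counts() in a loop over the columns. This is an alternative to the groupby calls above, not a correction of them; the names columns and counts are just illustrative.

```python
import pandas as pd
from matplotlib import pyplot as plt

df = pd.DataFrame([[1, 'Male', 'UK'], [2, 'Female', 'NZ'], [3, 'Male', 'UK'], [4, 'Male', 'US']],
                  columns=['ID', 'Gender', 'Country'])

columns = ['Gender', 'Country']  # columns to chart (illustrative name)
fig, axes = plt.subplots(ncols=len(columns), figsize=(10, 5))
counts = {}
for ax, col in zip(axes, columns):
    # value_counts() tallies how often each option appears in the column
    counts[col] = df[col].value_counts()
    counts[col].plot(kind='pie', autopct='%.2f', ax=ax)
    ax.set_ylabel(col)
plt.tight_layout()
```

Call plt.show() afterwards to display the figure.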
