I am trying to make a bar chart of student scores by homework problem using pandas/matplotlib. I can make the bar chart no problem, but what I would like to do is select the color by the student score. For example, I am hoping that I can make scores <= 50 red, scores > 50 and <=75 yellow, etc.
Here is where I'm currently at:
import pandas as pd
import matplotlib.pyplot as plt
# make some arrays
score = [100, 50, 43, 67, 89, 2, 13, 56, 22, -1, 53]
homework_problem = ['A', 'B', 'C', 'B', 'A', 'D', 'D', 'A', 'C', 'D', 'B']
topic = ['F', 'G', 'H', 'G', 'H', 'F', 'H', 'G', 'G', 'F', 'H']
# put the arrays into a pandas df
df = pd.DataFrame()
df['score'] = score
df['homework_problem'] = homework_problem
df['topic'] = topic
#make sure it looks okay
print(df)
# let's groupby and plot
df.groupby(['homework_problem','score'])['topic'].size().unstack().plot(kind='bar',stacked=True, title = "Test")
plt.show()
This outputs a stacked bar chart in which every distinct score gets its own color.
You can try this:
import pandas as pd
import matplotlib.pyplot as plt
# make some arrays
score = [100, 50, 43, 67, 89, 2, 13, 56, 22, -1, 53]
homework_problem = ['A', 'B', 'C', 'B', 'A', 'D', 'D', 'A', 'C', 'D', 'B']
topic = ['F', 'G', 'H', 'G', 'H', 'F', 'H', 'G', 'G', 'F', 'H']
# put the arrays into a pandas df
df = pd.DataFrame()
df['score'] = score
df['homework_problem'] = homework_problem
df['topic'] = topic
df['scoregroup'] = pd.cut(df['score'],bins=[0,50,75,100], labels=['Poor','Bad','Good'])
#make sure it looks okay
print(df)
# let's groupby and plot
d = df.groupby(['homework_problem', 'scoregroup'])['topic'].size().unstack()
d.plot(kind='bar', stacked=True, title="Test")
plt.show()
Output:
    score homework_problem topic scoregroup
0     100                A     F       Good
1      50                B     G       Poor
2      43                C     H       Poor
3      67                B     G        Bad
4      89                A     H       Good
5       2                D     F       Poor
6      13                D     H       Poor
7      56                A     G        Bad
8      22                C     G       Poor
9      -1                D     F        NaN
10     53                B     H        Bad
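Note that the score of -1 falls outside the bins and becomes NaN, and that the plot still uses the default colors. A minimal sketch of one way to handle both is below. It assumes a lowest bin edge of -inf so that negative scores land in the first group, uses the hypothetical labels 'low', 'mid' and 'high', and picks green for the top group, since the question only names red and yellow. The colors are passed to plot() as a list, one per score group column.

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    "score": [100, 50, 43, 67, 89, 2, 13, 56, 22, -1, 53],
    "homework_problem": ['A', 'B', 'C', 'B', 'A', 'D', 'D', 'A', 'C', 'D', 'B'],
})

# Assumption: -inf as the lowest edge, so a score like -1 is binned instead of becoming NaN.
# Intervals are right-closed: (-inf, 50], (50, 75], (75, 100].
bins = [float("-inf"), 50, 75, 100]
labels = ["low", "mid", "high"]  # hypothetical group names
df["scoregroup"] = pd.cut(df["score"], bins=bins, labels=labels)

# Count rows per (homework_problem, scoregroup); observed=False keeps empty groups as 0.
counts = (
    df.groupby(["homework_problem", "scoregroup"], observed=False)
      .size()
      .unstack(fill_value=0)
)

# One color per column, in the same order as the labels (green is an assumption).
ax = counts.plot(kind="bar", stacked=True,
                 color=["red", "yellow", "green"], title="Test")
ax.set_ylabel("number of scores")
```

Call plt.show() afterwards to display the figure. Because the colors are matched to columns by position, the order of the color list has to follow the order of the labels.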