I've a Dataframe in this format:
| Department | Person | Power | ... |
|------------|--------|--------|-----|
| ABC | 1234 | 75 | ... |
| ABC | 1235 | 25 | ... |
| DEF | 1236 | 50 | ... |
| DEF | 1237 | 100 | ... |
| DEF | 1238 | 25 | ... |
| DEF | 1239 | 50 | ... |
What I now want to get is the sum of occurrences for each value in the power column. How can I get this from my DataFrame?
| Department | 100 | 75 | 50 | 25 |
|------------|-----|-----|-----|-----|
| ABC | 0 | 1 | 0 | 1 |
| DEF | 1 | 0 | 2 | 1 |
You can use value_counts with sort_index, then generate DataFrame by to_frame and last transpose by T:
print (df.Power.value_counts().sort_index(ascending=False).to_frame().T)
100 75 50 25
Power 1 1 2 2
EDIT by comment:
You need crosstab:
print (pd.crosstab(df.Department, df.Power).sort_index(axis=1, ascending=False))
Power 100 75 50 25
Department
ABC 0 1 0 1
DEF 1 0 2 1
Faster another solution with groupby and unstack:
print (df.groupby(['Department','Power'])
.size()
.unstack(fill_value=0)
.sort_index(axis=1, ascending=False))
Power 100 75 50 25
Department
ABC 0 1 0 1
DEF 1 0 2 1
If need groupby by columns Department and Person, add column Person to groupby to second position (thank you piRSquared):
print (df.groupby(['Department','Person', 'Power'])
.size()
.unstack(fill_value=0)
.sort_index(axis=1, ascending=False))
Power 100 75 50 25
Department Person
ABC 1234 0 1 0 0
1235 0 0 0 1
DEF 1236 0 0 1 0
1237 1 0 0 0
1238 0 0 0 1
1239 0 0 1 0
EDIT1 by comment:
If need add another missing values, use reindex:
print (df.groupby(['Department','Power'])
.size()
.unstack(fill_value=0)
.reindex(columns=[100,75,50,25,0], fill_value=0))
Power 100 75 50 25 0
Department
ABC 0 1 0 1 0
DEF 1 0 2 1 0
or it can be done this way:
>>> df.groupby(['Department','Power']).count().unstack().fillna(0)
Person
Power 25 50 75 100
Department
ABC 1.0 0.0 1.0 0.0
DEF 1.0 2.0 0.0 1.0
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With