Pandas groupby 0 value if does not exist

Question

I have code like this:

frame[frame['value_text'].str.match('Type 2')  | frame['value_text'].str.match('Type II diabetes')].groupby(['value_text','gender'])['value_text'].count()

which returns a Series like this:

value_text            gender      count
type 2                  M           4
type 2 without...       M           4
                        F           3

What I want is:

 value_text               gender      count
    type 2                  M           4
                            F           0
    type 2 without...       M           4
                            F           3

I want to include a count for every gender, even when there are no matching records in the dataframe. How can I do this?

jpp · Accepted Answer

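Categorical data in pandas handles this case well.

In effect, groupby operations on categorical columns can calculate the Cartesian product of the categories, so combinations that never occur in the data still appear in the result. This behaviour is controlled by the observed parameter of groupby: observed=False includes unobserved categories. The default differs between pandas versions, so it is safest to pass it explicitly.

Compared with other approaches, categoricals bring additional benefits: lower memory usage and validation of the allowed values.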

import pandas as pd

df = pd.DataFrame({'value_text': ['type2', 'type2 without', 'type2'],
                   'gender': ['M', 'F', 'M'],
                   'value': [1, 2, 3]})

df['gender'] = df['gender'].astype('category')

# observed=False keeps every gender category, including unobserved combinations
res = df.groupby(['value_text', 'gender'], observed=False).count()\
        .fillna(0).astype(int)\
        .reset_index()

print(res)

      value_text gender  value
0          type2      F      0
1          type2      M      2
2  type2 without      F      1
3  type2 without      M      0
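
To make the role of the categorical column more concrete, here is a minimal sketch that reuses the same toy dataframe and compares the two settings of the observed parameter. The variable names full and seen are only illustrative, and the comparison is an addition to the answer rather than part of it.

```python
import pandas as pd

df = pd.DataFrame({'value_text': ['type2', 'type2 without', 'type2'],
                   'gender': ['M', 'F', 'M'],
                   'value': [1, 2, 3]})
df['gender'] = df['gender'].astype('category')

# observed=False: every (value_text, gender) pair is returned,
# and pairs with no rows get a count of 0
full = df.groupby(['value_text', 'gender'], observed=False)['value'].count()

# observed=True: only pairs that actually occur in the data are returned
seen = df.groupby(['value_text', 'gender'], observed=True)['value'].count()

print(full)
print(seen)
```

With this data, full has four rows (two of them zero) while seen has only the two combinations present in the dataframe.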

Peter Leimbigler · Answer

Try appending .unstack().fillna(0).stack() to your current line, like so:

frame[frame['value_text'].str.match('Type 2')  |
      frame['value_text'].str.match('Type II diabetes')]\
.groupby(['value_text','gender'])['value_text'].count()\
.unstack().fillna(0).stack()
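
As a self-contained illustration of the same idea, the sketch below uses a small made-up frame (the rows are hypothetical, not the asker's data). It also swaps .fillna(0) for unstack(fill_value=0), a slight variation that should keep the counts as integers instead of converting them to floats.

```python
import pandas as pd

# Hypothetical sample data: 'Type 2' has no female records
frame = pd.DataFrame({
    'value_text': ['Type 2', 'Type 2', 'Type 2 without', 'Type 2 without'],
    'gender': ['M', 'M', 'M', 'F'],
})

counts = frame.groupby(['value_text', 'gender'])['value_text'].count()

# unstack moves gender into columns; fill_value=0 fills the missing
# (value_text, gender) pairs, and stack restores the two-level index
filled = counts.unstack(fill_value=0).stack()

print(filled)
```

Unlike the categorical approach, this only fills in genders that occur somewhere in the filtered data; a gender absent from every group would still be missing.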

Tags:

python

pandas

pandas-groupby

Bejita
