Pandas: groupby index and then fill the dataframe using function

Question

I have a dataframe which is like this:

df = pd.DataFrame({'pop1': ['1000', '2000','3000','4000'],
                   'pop2': ['2000', '3000','2000','2000']},
                  index=['female','female','male','male'])

How can I create a 2*2 DataFrame that gives the percent of the total population in a given year (the 2 columns) for a given gender (the two rows)?

jezrael · Accepted Answer

You need first cast string to int by astype, then groupby with aggregating sum and divide by div by sum. Last multiple 100:

df = df.astype(int)

a = df.groupby(level=0).sum()
print (a)
        pop1  pop2
female  3000  5000
male    7000  4000

b = df.sum()
print (b)
pop1    10000
pop2     9000
dtype: int64

print (a.div(b).mul(100))
        pop1       pop2
female  30.0  55.555556
male    70.0  44.444444

It is same as:

df = df.astype(int)
print (df.groupby(level=0).sum().div(df.sum()).mul(100))
        pop1       pop2
female  30.0  55.555556
male    70.0  44.444444

Karnage · Answer

Here is a one liner:

(df.astype(int) / df.astype(int).sum()).groupby(level=0).sum() * 100

It is a little prettier if you are already dealing with integers:

df = df.astype(int)
(df / df.sum()).groupby(level=0).sum() * 100

Put into words, after you convert the data into integers, you then divide each number by the total size of the relevant population, sum up those weights for each gender, and then multiply by 100 so the result looks like a percentage.

Pandas: groupby index and then fill the dataframe using function

Tags:

python

pandas

dataframe

Candice Zhang

2 Answers

jezrael

Karnage

Recent Activity

Donate For Us

Pandas: groupby index and then fill the dataframe using function

Tags:

python

pandas

dataframe

Candice Zhang

2 Answers

jezrael

Karnage

Related questions

Recent Activity

Donate For Us