I have a dataframe of mean housing prices by month that looks like this:
RegionName 2000-01 2000-02 2000-03
New York 200000 210000 220000
Austin 100000 110000 130000 ...
Los Angeles 180000 190000 200000
I have a list of lists of months corresponding to quarters, and a list of quarter labels, that look like this:
month_chunks = [['2000-01', '2000-02', '2000-03'], ['2000-04', '2000-05', '2000-06']...]
quarters = ['2000q1', '2000q2', '2000q3'...]
I'm trying to create columns in my dataframe that contain the mean price for each quarter:
for quarter, chunk in zip(quarters, month_chunks):
housing[quarter] = np.mean(housing[chunk].mean())
The result should look like this:
RegionName 2000-01 2000-02 2000-03 2000q1
New York 200000 210000 220000 210000
Austin 100000 110000 130000 ... 113333.333
Los Angeles 180000 190000 200000 190000
But it is giving me a column with the same value duplicated for every row:
RegionName 2000-01 2000-02 2000-03 2000q1
New York 200000 210000 220000 210000
Austin 100000 110000 130000 ... 210000
Los Angeles 180000 190000 200000 210000
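The duplication happens because `np.mean(housing[chunk].mean())` collapses everything to a single scalar (the mean of the column means), which pandas then broadcasts into every row. A minimal check on a toy frame (values assumed for illustration):

```python
import numpy as np
import pandas as pd

# Toy data standing in for the real frame (values assumed)
housing = pd.DataFrame({
    '2000-01': [200000, 100000, 180000],
    '2000-02': [210000, 110000, 190000],
    '2000-03': [220000, 130000, 200000],
})
chunk = ['2000-01', '2000-02', '2000-03']

# housing[chunk].mean() averages each column, then np.mean averages
# those three numbers into one scalar
value = np.mean(housing[chunk].mean())
print(value)  # a single float, broadcast to every row on assignment
```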
The dataframe is large, so iterating over the rows and the lists is not feasible:
for i, row in housing.iterrows():
    for quarter, chunk in zip(quarters, month_chunks):
        housing.loc[i, quarter] = row[chunk].mean()
Don't use iterrows; you can perform your operation column-wise:
for months, qt in zip(month_chunks, quarters):
housing[qt] = housing[months].mean(axis=1)
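Passing `axis=1` to `mean` averages across the month columns, producing one value per row instead of a single scalar. A runnable sketch with toy values assumed from the question:

```python
import pandas as pd

# Toy data mirroring the question (values assumed)
housing = pd.DataFrame({
    'RegionName': ['New York', 'Austin', 'Los Angeles'],
    '2000-01': [200000, 100000, 180000],
    '2000-02': [210000, 110000, 190000],
    '2000-03': [220000, 130000, 200000],
})
month_chunks = [['2000-01', '2000-02', '2000-03']]
quarters = ['2000q1']

for months, qt in zip(month_chunks, quarters):
    # axis=1: average across the selected month columns, row by row
    housing[qt] = housing[months].mean(axis=1)

print(housing[['RegionName', '2000q1']])
```

Each quarter column is computed in a single vectorized pass, so this stays fast even on a large frame.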
Here is one way, using groupby with axis=1:
from collections import ChainMap
d = dict(ChainMap(*[dict.fromkeys(x, y) for x, y in zip(month_chunks, quarters)]))
s = housing.set_index('RegionName').groupby(d, axis=1).mean()
s
Out[32]:
                    2000q1
RegionName
New York     210000.000000
Austin       113333.333333
Los Angeles  190000.000000
df = pd.concat([housing.set_index('RegionName'), s], axis=1)
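The dict maps each month column to its quarter label, and grouping the columns by that mapping averages them. Since `groupby(axis=1)` is deprecated in recent pandas versions, an equivalent sketch below groups the transpose instead (toy values assumed from the question):

```python
import pandas as pd
from collections import ChainMap

# Toy data mirroring the question (values assumed)
housing = pd.DataFrame({
    'RegionName': ['New York', 'Austin', 'Los Angeles'],
    '2000-01': [200000, 100000, 180000],
    '2000-02': [210000, 110000, 190000],
    '2000-03': [220000, 130000, 200000],
})
month_chunks = [['2000-01', '2000-02', '2000-03']]
quarters = ['2000q1']

# Map each month column name to its quarter label
d = dict(ChainMap(*[dict.fromkeys(x, y) for x, y in zip(month_chunks, quarters)]))

# Transposing turns the month columns into the index, so a plain
# groupby on the dict averages the months belonging to each quarter
s = housing.set_index('RegionName').T.groupby(d).mean().T

# Attach the quarter columns next to the original month columns
df = pd.concat([housing.set_index('RegionName'), s], axis=1)
print(df)
```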