pandas Dataframe groupby, sort groups by absolute value

Question

Hi everyone, I basically want to find an efficient way to sort grouped data by absolute value within each group.

For example:

item    itemID  value
car     A           5
        B          -3
        C           2
        D          -4
        E           1
houses  A          -2
        B           4
        C          -6
        D           3
        E           7

Should be:

item    itemID  value
car     A           5
        D          -4
        B          -3
        C           2
        E           1
houses  E           7
        C          -6
        B           4
        D           3
        A          -2

Here is the dataframe and groupby for reference:

import pandas as pd

data = {'item': ['car', 'car', 'car', 'car', 'car', 'houses', 'houses', 'houses', 'houses', 'houses'],
        'itemID': ['A', 'B', 'C', 'D', 'E', 'A', 'B', 'C', 'D', 'E'],
        'value': [5, -3, 2, -4, 1, -2, 4, -6, 3, 7]}
df = pd.DataFrame(data)
gdf = df.groupby('item')

I've tried this:

gdf.apply(lambda g: g.reindex(g[['value']].abs().sort('value', ascending=True).index))

and it works fine most of the time, but sometimes it gives me this error:

ValueError: Shape of passed values is (100,10), indices imply (105, 10)

I don't get this error with the data set provided here. I get it with some of the larger datasets I work with, which I can't share, but I'm sure the data itself is not the cause since the datasets are all very similar.

I've done some debugging, and the error occurs every time apply duplicates the first group.

So are there any better ways to do it without using apply?

Note: I tried transform, but it gets rid of the groups and outputs a different dataset, which is definitely not what I want. I want to keep the groups and the format. Maybe I'm using it wrong?

Parfait · Accepted Answer

Consider simply creating an absolute value column through a defined function, then sorting by item ascending and absolute value descending. Because abs() works element-wise, the function can be called on the whole data frame rather than on a groupby. Finally, filter out the newly created, unneeded column:

# CREATE ABS VALUE FUNCTION TO CREATE COLUMN
def valsort(frame):
    frame = frame.copy()                      # LEAVE THE ORIGINAL DATA FRAME UNTOUCHED
    frame['absvalue'] = frame['value'].abs()
    return frame

# APPLY FUNCTION, SORT, AND RESET INDEX
# (sort_values REPLACES DataFrame.sort, WHICH WAS REMOVED IN PANDAS 0.20)
gdf = valsort(df).sort_values(['item', 'absvalue'],
                              ascending=[True, False]).reset_index(drop=True)

# FILTER OUT ABS VALUE
gdf = gdf[['item', 'itemID', 'value']]

print(gdf)

OUTPUT

     item itemID  value
0     car      A      5
1     car      D     -4
2     car      B     -3
3     car      C      2
4     car      E      1
5  houses      E      7
6  houses      C     -6
7  houses      B      4
8  houses      D      3
9  houses      A     -2
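
On pandas 1.1 or later, the helper column can be avoided with the key argument of sort_values. The following is only a sketch of that variant, not part of the answer above: it sorts by absolute value first, then applies a stable sort on item so the within-group order survives. The name result is just an illustrative choice.

```python
import pandas as pd

data = {
    'item': ['car'] * 5 + ['houses'] * 5,
    'itemID': list('ABCDE') * 2,
    'value': [5, -3, 2, -4, 1, -2, 4, -6, 3, 7],
}
df = pd.DataFrame(data)

result = (
    # Step 1: order every row by absolute value, largest first.
    df.sort_values('value', key=lambda s: s.abs(), ascending=False)
      # Step 2: stable sort on the group column keeps the step 1 order
      # inside each group ('mergesort' is a stable algorithm).
      .sort_values('item', kind='mergesort')
      .reset_index(drop=True)
)
print(result)
```

Since no groupby or apply is involved, this route sidesteps the shape mismatch described in the question, at the cost of two sorting passes.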

Nader Hisham · Answer

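In [48]:
# Assumes numpy is imported as np and df is indexed by 'item'
# (for example via df = df.set_index('item')).
# Within each group, order the values by descending absolute value.
# Only the 'value' column is reassigned; the itemID column stays in place.
df['value'] = df.groupby(df.index)['value'].apply(lambda x : x[np.argsort(np.abs(x))][::-1])
df
Out[48]:
       itemID  value
item
cars        A      5
cars        B     -4
cars        C     -3
cars        D      2
cars        E      1
houses      A      7
houses      B     -6
houses      C      4
houses      D      3
houses      E     -2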
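
If the itemID should travel with its value, a NumPy-based variant can reorder whole rows instead of a single column. This is a rough sketch rather than the answerer's code: it assumes item is still a regular column, and the names codes, abs_vals, order and result are made up for illustration. It uses np.lexsort, which treats the last key as the primary one.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'item': ['car'] * 5 + ['houses'] * 5,
    'itemID': list('ABCDE') * 2,
    'value': [5, -3, 2, -4, 1, -2, 4, -6, 3, 7],
})

# Integer code per group, numbered in sorted order of the item labels.
codes, _ = pd.factorize(df['item'], sort=True)
abs_vals = df['value'].abs().to_numpy()

# Primary key: group code (last in the tuple).
# Secondary key: negated absolute value, so larger magnitudes come first.
order = np.lexsort((-abs_vals, codes))

# Take complete rows in that order, then index by item as in the answer above.
result = df.iloc[order].set_index('item')
print(result)
```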

Tags: python, pandas, dataframe, numpy

Blade Cutts
