
n highest values in dataframe

Tags: python, pandas, max

I have a pandas DataFrame like:

        column0     column1     column2     column3     column4
row0    179319.0    180895.0    94962.0     130734.0    0
row1    89659.5     90447.5     47481.0     65367.0     0
row2    59773.0     60298.33333 31654.0     43578.0     0
row3    44829.75    45223.75    23740.5     32683.5     0
row4    35863.8     36179.0     18992.4     26146.8     0
row5    29886.5     30149.16666 15827.0     21789.0     0
row6    25617.0     25842.14285 13566.0     18676.28571 0
row7    22414.875   22611.875   11870.25    16341.75    0
row8    19924.33333 20099.44444 10551.33333 14526.0     0

I would like to get the index of the 9 highest values in the whole frame (9 being the number of rows), or a count of how many of those highest values fall in each column, like:

column0  column1  column2  column3  column4
3        3        1        2        0

In my example the 9 highest values would be the ones from column0, column1, column2, and column3 from row0, the ones from column0, column1, and column3 from row1, and the ones from column0 and column1 from row2.

Any ideas? Thanks!

Oscar asked Dec 22 '25 21:12


1 Answer

IIUC, you can use nlargest after stack:

df.stack().nlargest(9).groupby(level=1).count().reindex(df.columns, fill_value=0)
Out[48]: 
column0    3
column1    3
column2    1
column3    2
column4    0
dtype: int64
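
For anyone who wants to see what each step of that chain does, here is a minimal sketch that rebuilds the frame from the question and splits the one-liner into parts. The intermediate names (`n`, `stacked`, `top`, `top_positions`, `counts`) are only illustrative and are not part of the original answer. It also pulls out the (row, column) index of the 9 highest values, which the question asked about as well.

```python
import pandas as pd

# Data copied from the question (values as printed there).
data = {
    "column0": [179319.0, 89659.5, 59773.0, 44829.75, 35863.8,
                29886.5, 25617.0, 22414.875, 19924.33333],
    "column1": [180895.0, 90447.5, 60298.33333, 45223.75, 36179.0,
                30149.16666, 25842.14285, 22611.875, 20099.44444],
    "column2": [94962.0, 47481.0, 31654.0, 23740.5, 18992.4,
                15827.0, 13566.0, 11870.25, 10551.33333],
    "column3": [130734.0, 65367.0, 43578.0, 32683.5, 26146.8,
                21789.0, 18676.28571, 16341.75, 14526.0],
    "column4": [0.0] * 9,
}
df = pd.DataFrame(data, index=[f"row{i}" for i in range(9)])

n = len(df)  # 9, the number of rows

# stack() reshapes the frame into a Series indexed by (row, column) pairs.
stacked = df.stack()

# nlargest(n) keeps the n biggest values, sorted in descending order;
# the index of the result tells you where each value came from.
top = stacked.nlargest(n)
top_positions = top.index.tolist()  # list of (row, column) tuples

# Count how many of the top values fall in each column (index level 1),
# then reindex so that columns with no hits are reported as 0.
counts = top.groupby(level=1).count().reindex(df.columns, fill_value=0)

print(top_positions)
print(counts)
```

Here `top_positions` starts with `('row0', 'column1')`, the largest value in the frame, and `counts` matches the output shown above.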
BENY answered Dec 24 '25 11:12


