I am working with a DataFrame in Python. How can I label each row according to which quartile (e.g. q1, q2, q3, q4) its value in a particular column, 'rate', falls into? Here, the interval is the range of 'rate', so [-0, 0.913056] is the entire range. I want to indicate which quartile of that range the 'rate' value in each row falls into.
    name                              rate
0   3POWER ENERGY GROUP INC      -0.000000
1   808 RENEWABLE ENERGY CORP    -0.112192
2   YORK WATER CO                 0.774955
3   ZTO EXPRESS (CAYM) INC -ADR   0.086352
4   AEP GENERATING CO             0.850960
5   AEP TEXAS CENTRAL CO          0.600301
6   AIR T INC                     0.254511
7   ALABAMA GAS CORP              0.611631
8   ALABAMA POWER CO              0.913056
9   ALLEGIANT TRAVEL CO           0.227421
10  COMCAST CORP                  0.012037
11  HAWAIIAN ELECTRIC CO          0.670980
12  HAWAIIAN ELECTRIC INDS        0.775778
I want the df to look like this:
    name                              rate  quartile
0   3POWER ENERGY GROUP INC      -0.000000  q1
1   808 RENEWABLE ENERGY CORP    -0.112192  q1
2   YORK WATER CO                 0.774955  q3
3   ZTO EXPRESS (CAYM) INC -ADR   0.086352  q1
4   AEP GENERATING CO             0.850960  q4
5   AEP TEXAS CENTRAL CO          0.600301  q3
6   AIR T INC                     0.254511  q2
7   ALABAMA GAS CORP              0.611631  q3
8   ALABAMA POWER CO              0.913056  q4
9   ALLEGIANT TRAVEL CO           0.227421  q2
10  COMCAST CORP                  0.012037  q1
11  HAWAIIAN ELECTRIC CO          0.670980  q4
12  HAWAIIAN ELECTRIC INDS        0.775778  q4
You need pd.qcut, which assigns each value to a quantile-based bin:
import pandas as pd
df['quartile'] = pd.qcut(df['rate'], q=4, labels=['q1', 'q2', 'q3', 'q4'])
print(df)
    name                              rate  quartile
0   3POWER ENERGY GROUP INC      -0.000000  q1
1   808 RENEWABLE ENERGY CORP    -0.112192  q1
2   YORK WATER CO                 0.774955  q3
3   ZTO EXPRESS (CAYM) INC -ADR   0.086352  q1
4   AEP GENERATING CO             0.850960  q4
5   AEP TEXAS CENTRAL CO          0.600301  q2
6   AIR T INC                     0.254511  q2
7   ALABAMA GAS CORP              0.611631  q3
8   ALABAMA POWER CO              0.913056  q4
9   ALLEGIANT TRAVEL CO           0.227421  q2
10  COMCAST CORP                  0.012037  q1
11  HAWAIIAN ELECTRIC CO          0.670980  q3
12  HAWAIIAN ELECTRIC INDS        0.775778  q4
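For anyone who wants to try this without the original data file, here is a minimal self-contained sketch that rebuilds the 'rate' column from the values in the question (names omitted for brevity). It also shows pd.cut next to pd.qcut, on the assumption that "quartile of the range" might have meant four equal-width slices of the min-to-max range rather than four groups with roughly equal row counts. The 'range_bin' column name is hypothetical and only used for illustration.

```python
import pandas as pd

# Sample values copied from the question (company names omitted for brevity).
rates = [-0.000000, -0.112192, 0.774955, 0.086352, 0.850960, 0.600301,
         0.254511, 0.611631, 0.913056, 0.227421, 0.012037, 0.670980, 0.775778]
df = pd.DataFrame({"rate": rates})
labels = ["q1", "q2", "q3", "q4"]

# Quantile-based bins: each label covers roughly a quarter of the rows.
df["quartile"] = pd.qcut(df["rate"], q=4, labels=labels)

# Equal-width bins: each label covers a quarter of the min-to-max range.
# "range_bin" is a hypothetical column name used only in this sketch.
df["range_bin"] = pd.cut(df["rate"], bins=4, labels=labels)

print(df)
```

The two schemes can disagree on individual rows: 0.600301, for example, is q2 under pd.qcut but lands in the third equal-width slice under pd.cut. Which one is appropriate depends on whether the goal is equal-sized groups or equal-sized intervals.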