I'm confused about how rank works for a Series. My understanding is that ranks are assigned from the highest value to the lowest, and that if two numbers are equal, pandas averages their ranks.
In this example, the highest value is 7. Why do we get rank 5.5 for the number 7 and rank 1.5 for the number 4?
import pandas as pd
S1 = pd.Series([7, 6, 7, 5, 4, 4])
S1.rank()
Output:
0 5.5
1 4.0
2 5.5
3 3.0
4 1.5
5 1.5
dtype: float64
By default, rank() ranks in ascending order, so the lowest value gets rank 1 and the highest value gets the largest rank. The ranks are calculated this way:
Elements - 4, 4, 5, 6, 7, 7
Ranks - 1, 2, 3, 4, 5, 6
Since 4 appears twice, the final rank of each occurrence is the average of 1 and 2, which is 1.5. In the same way, for 7, the final rank of each occurrence is the average of 5 and 6, which is 5.5.
Elements - 4, 4, 5, 6, 7, 7
Ranks - 1, 2, 3, 4, 5, 6
Final Rank - 1.5, 1.5, 3, 4, 5.5, 5.5
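The tie-averaging steps can be reproduced by hand and compared against rank(). The following is a minimal sketch, not pandas' internal implementation: the variable names (s1, positions, manual_ranks, and so on) are illustrative, while ascending and method are existing parameters of Series.rank. It also shows that passing ascending=False gives the "highest value first" ordering the question expected.

```python
import pandas as pd

s1 = pd.Series([7, 6, 7, 5, 4, 4])

# Default behaviour: ascending=True, method="average"
default_ranks = s1.rank()

# Manual reconstruction (illustrative only):
# 1. sort the values and number the sorted positions 1..n
# 2. average the positions within each group of equal values
positions = pd.Series(range(1, len(s1) + 1), index=s1.sort_values().index)
manual_ranks = positions.groupby(s1).transform("mean").sort_index()

# Rank from highest to lowest instead: the two 7s now share ranks 1 and 2
descending_ranks = s1.rank(ascending=False)

# Other tie-breaking rules: "min" gives every tied value the lowest rank
# of its group, "dense" does the same but leaves no gaps between groups
min_ranks = s1.rank(method="min")
dense_ranks = s1.rank(method="dense")

print(default_ranks.tolist())     # [5.5, 4.0, 5.5, 3.0, 1.5, 1.5]
print(manual_ranks.tolist())      # [5.5, 4.0, 5.5, 3.0, 1.5, 1.5]
print(descending_ranks.tolist())  # [1.5, 3.0, 1.5, 4.0, 5.5, 5.5]
print(min_ranks.tolist())         # [5.0, 4.0, 5.0, 3.0, 1.0, 1.0]
print(dense_ranks.tolist())       # [4.0, 3.0, 4.0, 2.0, 1.0, 1.0]
```

So if you want the highest value to get the smallest rank, ascending=False is probably what you are after; the averaging of tied ranks works the same way in both directions.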