I'd like to have a calculated field that gives me the percentile of a column's value in a table. What is the best way to do so?
I have a table with only one column containing values ranging from 0 to 10000, randomly distributed. I want to add another column to tell me what the percentile of the value on the same row is with respect to all the other values in the original column.
There are 2 definitions of percentile as explained here: http://onlinestatbook.com/chapter1/percentiles.html .
I am using the definition that I knew. Example: a value is at 25th percentile means 25% of the population is AT or below the value.
The algorithm I am thinking of is below, I hope someone can translate it to SQL for me because I am an SQL beginner:
For all the rows in my table, if the value is less than or equal to value in current row then count++. At the end of table, I do the division: count / number of rows to have my percentile. That's accurate enough for me because there are lots of rows.
[Edited to match comments in question]:
SELECT Number, 100 * (ROW_NUMBER() OVER (ORDER BY Number))/Count(*) AS Percentile
FROM MyTable
Then if your Numbers are 2, 5, 10, and 14, you should get:
Number Percentile
2 25
5 50
10 75
14 100
Here's the problem: if there are duplicates, then ROW_NUMBER will screw you up. If your Numbers are 2, 5, 5, and 14, the code above will give you:
Number Percentile
2 25
5 50
5 75
14 100
So... do this instead.
SELECT Number, MAX(100 * (ROW_NUMBER() OVER (ORDER BY Number))/Count(*)) AS Percentile
FROM MyTable
GROUP BY Number
If your numbers are 2, 5, 5, 14, it should return
Number Percentile
2 25
5 75
14 100
select CAST(SUM(CASE WHEN Number <= 20
THEN 1 ELSE 0 END) as float)/COUNT(*) as PercentileFor20
from Numbers;
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With