R: find the percentage of the way through an ordered vector at which each value first appears

I am looking for a way to take an ordered vector and return the percentage of the way through the vector that each value appears for the first time.

See below for the input vector and the expected result.

InputVector <- c(1, 1, 1, 1, 1, 2, 2, 2, 3, 3)

ExpectedResult <- data.frame(Value = c(1, 2, 3), Percentile = c(0, 0.5, 0.8))

In this case, 1 appears at the 0th percentile, 2 at the 50th and 3 at the 80th.

locket asked Sep 02 '25 06:09


2 Answers

In base R, with rle and cumsum:

p <- with(rle(InputVector), cumsum(lengths) / sum(lengths))
c(0, p[-length(p)])
#[1] 0.0 0.5 0.8
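
To get the data frame from the question, the same run-length idea can be wrapped in a small helper. This is only a minimal sketch: the function name first_appearance is hypothetical, and it assumes the input is non-empty and already sorted, so that each value forms a single run.

```r
# Hypothetical helper: fraction of the way through a sorted vector
# at which each distinct value first appears.
first_appearance <- function(x) {
  runs <- rle(x)                        # run values and run lengths
  ends <- cumsum(runs$lengths)          # index of the last element of each run
  starts <- c(0, ends[-length(ends)])   # number of elements before each run
  data.frame(Value = runs$values, Percentile = starts / length(x))
}

InputVector <- c(1, 1, 1, 1, 1, 2, 2, 2, 3, 3)
first_appearance(InputVector)
```

Shifting the cumulative lengths by one position is what turns "where each run ends" into "where each run starts", which is why the first value always lands at 0.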
Maël answered Sep 04 '25 20:09


Using rank() and unique():

data.frame(
    Value = InputVector,
    Percentile = (rank(InputVector, ties.method = "min") - 1) / length(InputVector)
  ) |>
  unique()

#>   Value Percentile
#> 1     1        0.0
#> 6     2        0.5
#> 9     3        0.8
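
The reason rank() with ties.method = "min" works here is that, for a sorted vector, the minimum rank of a value equals the index of its first occurrence. A sketch of the same result using match() instead (a swapped-in technique, not part of the code above; the variable names are illustrative):

```r
InputVector <- c(1, 1, 1, 1, 1, 2, 2, 2, 3, 3)

vals <- unique(InputVector)
first_idx <- match(vals, InputVector)   # index of each value's first occurrence

result <- data.frame(
  Value = vals,
  Percentile = (first_idx - 1) / length(InputVector)
)
result

# For sorted input, the minimum rank at those positions is the position itself
min_rank <- rank(InputVector, ties.method = "min")
all(min_rank[first_idx] == first_idx)
```

If the input is not sorted, the two approaches answer different questions: match() reports position in the vector as given, while rank() reports position in sorted order.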

You could also use dplyr::percent_rank(), but note that it divides the minimum rank minus one by n - 1 rather than n, so the percentiles differ from the expected result:

library(dplyr)

tibble(
    Value = InputVector,
    Percentile = percent_rank(Value)
  ) %>% 
  distinct()

#> # A tibble: 3 × 2
#>   Value Percentile
#>   <dbl>      <dbl>
#> 1     1      0    
#> 2     2      0.556
#> 3     3      0.889

Created on 2022-11-09 with reprex v2.0.2
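
The gap between the two results comes from the denominator: percent_rank() scales the minimum rank minus one by n - 1, whereas the rank() version divides by n. A minimal base R sketch of that comparison, with no dplyr needed (variable names are illustrative):

```r
InputVector <- c(1, 1, 1, 1, 1, 2, 2, 2, 3, 3)

n <- length(InputVector)
min_rank <- rank(InputVector, ties.method = "min")

by_n         <- unique((min_rank - 1) / n)        # matches the expected result
by_n_minus_1 <- unique((min_rank - 1) / (n - 1))  # same formula percent_rank() uses

by_n
round(by_n_minus_1, 3)
```

With n = 10, the second version gives 5/9 and 8/9 instead of 5/10 and 8/10, so only the division by n reproduces the output asked for in the question.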

zephryl answered Sep 04 '25 21:09