Show percentiles of Variable A, while the classification of percentiles is based on Variable B

Question

I have a dataset that looks like the following:

INCOME	WEALTH
10.000	100000
15.000	111000
14.200	123456
12.654	654321

I have many more rows.

I now want to find how much INCOME a household in a specific WEALTH percentile has. The following quantiles are relevant:

c(0.01,0.05,0.1,0.25,0.5,0.75,0.9,0.95,0.99)

I have always used the following code to get specific percentile values:

a <- quantile(WEALTH, probs = c(0.01,0.05,0.1,0.25,0.5,0.75,0.9,0.95,0.99))

But now I want to base my percentiles on WEALTH but get the respective INCOME. I have tried the following code but the results are not plausible:

library(dplyr)  # provides ntile(), group_by(), summarise() and %>%
df$percentile = ntile(df$WEALTH,100)
df <- df[df$percentile %in% c(1,5,10,25,50,75,90,95,99), ]

a <- df %>% 
  group_by(percentile) %>% 
  summarise(max = max(INCOME))

The results that I get are not consistent with other parts of the analysis that I have done. I assume that the percentiles returned by the "quantile" function are calculated differently than by simply taking the maximum.

RYann · Accepted Answer

I'm not sure I understood your question correctly, but the quantile function offers several methods of calculation, selected with its type argument (the default is type 7). I, for example, always go for type 6, since this is what I was taught in my stat courses.

type: an integer between 1 and 9 selecting one of the nine quantile algorithms detailed below to be used.
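To make the effect of the type argument concrete, here is a small sketch. The wealth values are just the four rows from the question, used as toy data; with so few observations the differences between types are exaggerated, so treat this as an illustration rather than a recommendation of any one type.

```r
# Toy data: the four WEALTH values shown in the question
wealth <- c(100000, 111000, 123456, 654321)
probs  <- c(0.25, 0.5, 0.75)

# Type 7 is R's default; type 6 is the variant mentioned above
q7 <- quantile(wealth, probs = probs, type = 7)
q6 <- quantile(wealth, probs = probs, type = 6)

# Type 1 inverts the empirical CDF, so it always returns an observed value
q1 <- quantile(wealth, probs = probs, type = 1)

# One row per type, one column per requested probability
print(rbind(type1 = q1, type6 = q6, type7 = q7))
```

On this data the median is the same for types 6 and 7, but the quartiles differ, and type 1 never interpolates between observations.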

You can read more about the different types on the help page for quantile: run ?quantile in the R console.
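If the goal is the INCOME of households that sit in a given WEALTH percentile, one possible approach (a sketch under my own assumptions, not necessarily what you need) is to rank households by WEALTH, keep the percentile groups of interest, and summarise INCOME within each group with the median instead of the maximum. The data below is simulated, and the column name wealth_pct is made up for this example.

```r
set.seed(1)  # simulated data, purely illustrative

n  <- 1000
df <- data.frame(WEALTH = rlnorm(n, meanlog = 11, sdlog = 1))
df$INCOME <- 20000 + 0.05 * df$WEALTH + rnorm(n, sd = 5000)

# Percentile groups of interest, kept as whole numbers to avoid
# floating-point surprises when matching
pcts <- c(1, 5, 10, 25, 50, 75, 90, 95, 99)

# Percentile group (1..100) of each household, based on its WEALTH rank
df$wealth_pct <- ceiling(100 * rank(df$WEALTH, ties.method = "max") / n)

# Keep only the groups of interest and take the median INCOME in each
keep <- df[df$wealth_pct %in% pcts, ]
income_by_pct <- aggregate(INCOME ~ wealth_pct, data = keep, FUN = median)

print(income_by_pct)
```

The maximum of INCOME within a group is driven by a single household, so it can jump around a lot from one group to the next. The median (or mean) of the group is usually a steadier summary, which may be why the max-based numbers looked implausible.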

Tags: r, percentile, quantile

Jakob
