Filter n rows of grouped data frame when different n for each group

Tags:

r

dplyr

I'd like to pick a different number of rows from each group of my data frame. I haven't figured out an elegant way to do this with dplyr yet. To pick the same number of rows from each group, I do this:

library(dplyr)

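iris %>%
    group_by(Species) %>%
    # rank explicitly by Sepal.Length; a bare top_n(2) would rank by the
    # last column (Species), which ties within a group and keeps every row
    slice_max(Sepal.Length, n = 2, with_ties = FALSE)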

But I would like to be able to reference another table that holds the number of rows I want for each group, such as this sample table:

top_rows_desired <- data.frame(Species = unique(iris$Species),
    n_desired = c(4,2,5))

cylondude asked Dec 06 '25 06:12


1 Answer

We can left_join 'iris' with 'top_rows_desired' by 'Species', group by 'Species', arrange by 'Sepal.Length' in descending order, slice the first 'n_desired' rows of each group, and remove the 'n_desired' column with select.

left_join(iris, top_rows_desired, by = "Species") %>%
    group_by(Species) %>%
    arrange(desc(Sepal.Length)) %>%
    slice(seq(first(n_desired))) %>%
    select(-n_desired)
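
As a self-contained sketch of the same join-then-slice idea, the block below rebuilds the lookup table from the question and counts the rows kept per species. Three details are assumptions added here, not part of the answer above: `.by_group = TRUE` makes the within-group sort explicit, `seq_len()` replaces `seq()` so that an `n_desired` of 0 keeps no rows (`seq(0)` evaluates to `c(1, 0)`, which would keep one row), and `ungroup()` returns a plain, ungrouped result.

```r
library(dplyr)

# Lookup table from the question: rows wanted per species
top_rows_desired <- data.frame(Species = unique(iris$Species),
                               n_desired = c(4, 2, 5))

result <- iris %>%
    left_join(top_rows_desired, by = "Species") %>%
    group_by(Species) %>%
    # sort within each group, largest Sepal.Length first
    arrange(desc(Sepal.Length), .by_group = TRUE) %>%
    # keep the first n_desired rows of each group (none if n_desired is 0)
    slice(seq_len(first(n_desired))) %>%
    ungroup() %>%
    select(-n_desired)

# Rows kept per species, to compare against top_rows_desired
count(result, Species)
```

The final `count()` call should report 4, 2 and 5 rows for setosa, versicolor and virginica, matching `n_desired`.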

akrun answered Dec 07 '25 22:12


