I'd like to pick a different number of rows from each group of my data frame. I haven't figured out an elegant way to do this with dplyr yet. To pick the same number of rows from each group, I do this:
library(dplyr)
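iris %>%
  group_by(Species) %>%
  arrange(Sepal.Length) %>%
  # name the ranking column; with no wt, top_n() ranks by the last column (Species here)
  top_n(2, Sepal.Length)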
But I would like to reference another table that holds the number of rows I want for each group, such as this sample table:
top_rows_desired <- data.frame(Species = unique(iris$Species),
                               n_desired = c(4, 2, 5))
We can left_join 'iris' with 'top_rows_desired' by 'Species', group by 'Species', sort by 'Sepal.Length' in descending order, slice the first 'n_desired' rows of each group, and then drop the 'n_desired' column with select.
left_join(iris, top_rows_desired, by = "Species") %>%
  group_by(Species) %>%
  arrange(desc(Sepal.Length)) %>%
  # n_desired is constant within a group, so its first value gives the row count
  slice(seq_len(first(n_desired))) %>%
  select(-n_desired)
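As a variation on the same join-then-trim idea, here is a minimal sketch that skips the explicit sort and instead keeps the rows whose within-group rank is at most 'n_desired'. It assumes the largest 'Sepal.Length' values are the ones wanted and that ties can be broken by row order; the name 'result' is just for illustration.

```r
library(dplyr)

# Lookup table from the question: rows wanted per species
top_rows_desired <- data.frame(Species = unique(iris$Species),
                               n_desired = c(4, 2, 5))

result <- iris %>%
  left_join(top_rows_desired, by = "Species") %>%
  group_by(Species) %>%
  # row_number(desc(x)) ranks rows within each group, largest first,
  # breaking ties by order of appearance, so exactly n_desired rows survive
  filter(row_number(desc(Sepal.Length)) <= n_desired) %>%
  ungroup() %>%
  select(-n_desired)

# Check the per-group counts: 4 setosa, 2 versicolor, 5 virginica
count(result, Species)
```

Unlike the slice approach, this leaves the surviving rows in their original order, so add an arrange step if the output should be sorted.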