I want to combine (reduce) a list of data frames into a single data frame and summarize the data in the same step. The data frames are output from a simulation, so they all share the same structure: a Group column followed by two value columns whose values vary from one data frame to the next.
Minimal Reproducible Example
df_list <- list(structure(list(Group = c("A", "B", "C"), Top_Group = c(1L,
0L, 0L), Efficiency = c(0.464688158128411, 0.652386676520109,
0.282913417555392)), row.names = c(NA, -3L), class = c("tbl_df",
"tbl", "data.frame")), structure(list(Group = c("A", "B", "C"
), Top_Group = c(0L, 1L, 0L), Efficiency = c(0.120292583014816,
0.0356206290889531, 0.37196880299598)), row.names = c(NA, -3L
), class = c("tbl_df", "tbl", "data.frame")), structure(list(
Group = c("A", "B", "C"), Top_Group = c(0L, 1L, 0L), Efficiency = c(0.261322160949931,
0.383351784432307, 0.754808459430933)), row.names = c(NA,
-3L), class = c("tbl_df", "tbl", "data.frame")))
What I Have Tried
I know I could bind the data together, then group and summarize.
library(tidyverse)
df_list %>%
  bind_rows() %>%
  group_by(Group) %>%
  summarise(Top_Group = sum(Top_Group), Efficiency = max(Efficiency))
# Group Top_Group Efficiency
# <chr> <int> <dbl>
#1 A 1 0.465
#2 B 2 0.652
#3 C 0 0.755
I was hoping there was some way to use something like reduce. However, I can only get it to work for pulling out one column (like Top_Group, shown here), and I am unsure how to apply it across all columns (if possible) and return a data frame instead of a vector.
df_list %>%
  map(2) %>%
  reduce(`+`)
# [1] 1 2 0
Expected Output
Group Top_Group Efficiency
<chr> <int> <dbl>
1 A 1 0.465
2 B 2 0.652
3 C 0 0.755
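In base R you could just do the following (it combines rows by position, so it relies on every data frame listing its groups in the same order):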
Reduce(function(a, b) cbind(a[1], a[2] + b[2], pmax(a[3], b[3])), df_list)
#> Group Top_Group Efficiency
#> 1 A 1 0.4646882
#> 2 B 2 0.6523867
#> 3 C 0 0.7548085
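Since the question asked about purrr's reduce specifically, here is a minimal sketch of the same idea in that style. It is not part of the original answer: `combine_two` is a hypothetical helper name, and the sketch assumes (as the base R version does) that every data frame lists its groups in the same row order. The list is rebuilt with `tibble()` from the question's values so the block runs on its own.

```r
library(tibble)
library(purrr)

# Same values as the question's df_list, rebuilt so the sketch is self-contained.
df_list <- list(
  tibble(Group = c("A", "B", "C"), Top_Group = c(1L, 0L, 0L),
         Efficiency = c(0.464688158128411, 0.652386676520109, 0.282913417555392)),
  tibble(Group = c("A", "B", "C"), Top_Group = c(0L, 1L, 0L),
         Efficiency = c(0.120292583014816, 0.0356206290889531, 0.37196880299598)),
  tibble(Group = c("A", "B", "C"), Top_Group = c(0L, 1L, 0L),
         Efficiency = c(0.261322160949931, 0.383351784432307, 0.754808459430933))
)

# Hypothetical helper: combine two data frames row by row.
# Assumption: both inputs have identical Group columns in the same order.
combine_two <- function(a, b) {
  stopifnot(identical(a$Group, b$Group))
  tibble(
    Group      = a$Group,
    Top_Group  = a$Top_Group + b$Top_Group,          # running sum
    Efficiency = pmax(a$Efficiency, b$Efficiency)    # running element-wise max
  )
}

# reduce() folds the list pairwise, so the result is still a data frame.
result <- reduce(df_list, combine_two)
result
```

Because the helper returns a data frame of the same shape as its inputs, reduce can keep folding it through the list. If the row order might differ between simulation runs, the bind_rows/group_by/summarise approach from the question is the safer choice, since it matches on Group rather than on position.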