When I process the data below, which contains NA values, the max function emits a warning. For the data frame df, I want to calculate the maximum value of Var_2 for each group.
library(dplyr)

df <- data.frame(
  Var_1 = c("Grp 1", "Grp 1", "Grp 1", "Grp 2", "Grp 2", "Grp 2", "Grp 3", "Grp 3", "Grp 3"),
  Var_2 = c(1, 2, 3, NA, NA, NA, 7, NA, 9)
)

ck <- df %>%
  group_by(Var_1) %>%
  summarise(max_var = max(Var_2, na.rm = TRUE))
print(ck)
Var_1 max_var
<chr> <dbl>
1 Grp 1 3
2 Grp 2 -Inf
3 Grp 3 9
If I don't pass na.rm = TRUE to the max function, the result is NA for Grp 3. If I do pass na.rm = TRUE, every value in Grp 2 is dropped, so max is called on an empty vector: it returns -Inf and throws the warning "no non-missing arguments to max; returning -Inf".
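To make the source of the -Inf concrete, here is a minimal base R sketch of what happens inside one all-NA group. The names x, kept, and res are illustrative only, and suppressWarnings is used just to keep the demo quiet, not as a recommended fix.

```r
# A group whose values are all NA, like Grp 2 in the question.
x <- c(NA_real_, NA_real_, NA_real_)

# This mirrors what na.rm = TRUE does before the maximum is taken:
# the missing values are removed, leaving an empty vector.
kept <- x[!is.na(x)]
length(kept)

# max() of an empty numeric vector warns and returns -Inf.
res <- suppressWarnings(max(kept))
res
```

So the warning is not about Grp 2 being removed from the data; it is about max having nothing left to compare within that group.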
The result I was hoping for was
Var_1 | max_var |
---|---|
Grp 1 | 3 |
Grp 2 | NA |
Grp 3 | 9 |
Does anyone have any suggestions on how to avoid this warning and get the result above? Thanks.
I've tried removing all the NA rows first and then calculating with summarise(max()), but this causes Grp 2 to disappear. I want to keep this group and have its result be NA.
df <- data.frame(
  Var_1 = c("Grp 1", "Grp 1", "Grp 1", "Grp 2", "Grp 2", "Grp 2", "Grp 3", "Grp 3", "Grp 3"),
  Var_2 = c(1, 2, 3, NA, NA, NA, 7, NA, 9)
)

ck <- df %>%
  filter(!is.na(Var_2)) %>%
  group_by(Var_1) %>%
  summarise(max_var = max(Var_2, na.rm = TRUE))
print(ck)
Var_1 max_var
<chr> <dbl>
1 Grp 1 3
2 Grp 3 9
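Another option is to first select the row with the maximum value per group using slice_max, and then remove the duplicated rows with distinct. The group that contains only NAs comes back with one row per NA, so distinct collapses them into a single row. Note that the column keeps its original name, Var_2: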
library(dplyr)

ck <- df %>%
  group_by(Var_1) %>%
  slice_max(Var_2) %>%
  distinct()
print(ck)
#> # A tibble: 3 × 2
#> # Groups: Var_1 [3]
#> Var_1 Var_2
#> <chr> <dbl>
#> 1 Grp 1 3
#> 2 Grp 2 NA
#> 3 Grp 3 9
Created on 2024-04-23 with reprex v2.1.0
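The result above is still grouped and the column is called Var_2 rather than max_var. A possible follow-up, assuming the same dplyr behaviour as in the reprex above (slice_max returning the NA rows for the all-NA group), is to ungroup and rename so the output matches the table asked for in the question:

```r
library(dplyr)

df <- data.frame(
  Var_1 = c("Grp 1", "Grp 1", "Grp 1", "Grp 2", "Grp 2", "Grp 2", "Grp 3", "Grp 3", "Grp 3"),
  Var_2 = c(1, 2, 3, NA, NA, NA, 7, NA, 9)
)

ck <- df %>%
  group_by(Var_1) %>%
  slice_max(Var_2) %>%          # rows holding the per-group maximum (ties kept)
  distinct() %>%                # collapse the repeated NA rows of Grp 2
  ungroup() %>%                 # drop the grouping left over from group_by()
  rename(max_var = Var_2)       # match the column name used in the question
print(ck)
```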
You can create your own function to return NA conditional on all of the values being NA:
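# Return NA when every value is missing; otherwise defer to max().
fn <- function(x, na.rm = FALSE) {
  if (all(is.na(x))) NA else max(x, na.rm = na.rm)
}

# .by groups for this call only; it needs dplyr 1.1.0 or later.
df %>%
  summarise(max_var = fn(Var_2, na.rm = TRUE), .by = Var_1)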
Var_1 max_var
1 Grp 1 3
2 Grp 2 NA
3 Grp 3 9
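If a separate helper feels like too much, the same check can be written inline. This is only a sketch of the same idea, not a different technique: it uses group_by instead of .by so it should also run on older dplyr versions, and NA_real_ is my choice here to keep the column explicitly numeric.

```r
library(dplyr)

df <- data.frame(
  Var_1 = c("Grp 1", "Grp 1", "Grp 1", "Grp 2", "Grp 2", "Grp 2", "Grp 3", "Grp 3", "Grp 3"),
  Var_2 = c(1, 2, 3, NA, NA, NA, 7, NA, 9)
)

ck <- df %>%
  group_by(Var_1) %>%
  summarise(
    # The expression is evaluated once per group, so max() is never
    # called on a group that has no non-missing values.
    max_var = if (all(is.na(Var_2))) NA_real_ else max(Var_2, na.rm = TRUE)
  )
print(ck)
```

Because max is skipped entirely for the all-NA group, no warning is raised and Grp 2 stays in the result with NA.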