
How to solve the "no non-missing arguments to max; returning -Inf" warning when using the max() function

Tags: r, dplyr

When I process the following data, which contains NA values, max() produces a warning. For the data frame df below, I want to calculate the maximum of Var_2 for each group in Var_1.

library(dplyr)
df <- data.frame(Var_1 = c("Grp 1", "Grp 1", "Grp 1", "Grp 2", "Grp 2", "Grp 2", "Grp 3", "Grp 3", "Grp 3"),
                 Var_2 = c(1,2,3, NA, NA, NA, 7, NA, 9))

ck <- df %>%
      group_by(Var_1) %>% 
      summarise(max_var = max(Var_2, na.rm = TRUE))

print(ck)

  Var_1 max_var
  <chr>   <dbl>
1 Grp 1       3
2 Grp 2    -Inf
3 Grp 3       9

If I don't pass na.rm = TRUE to max(), the result for Grp 3 is NA. If I do pass na.rm = TRUE, every value in Grp 2 is dropped before the maximum is taken, so max() has nothing left to compare: it returns -Inf and throws the warning "no non-missing arguments to max; returning -Inf".
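The behaviour is not specific to dplyr and can be reproduced in base R. The sketch below uses a made-up vector x as a stand-in for the Var_2 values of Grp 2; the variable names are only illustrative.

```r
# Hypothetical stand-in for the Var_2 values of Grp 2 (all missing)
x <- c(NA_real_, NA_real_, NA_real_)

# Without na.rm the NAs propagate: the result is NA and no warning is raised
without_rm <- max(x)

# With na.rm = TRUE every element is dropped, so max() sees an empty vector
# and falls back to -Inf (the warning is silenced here only to keep output clean)
with_rm <- suppressWarnings(max(x, na.rm = TRUE))

# Capture the warning text instead of the value
msg <- tryCatch(max(x, na.rm = TRUE), warning = function(w) conditionMessage(w))

print(without_rm)
print(with_rm)
print(msg)
```

Within summarise() the same thing happens once per group, which is why only the all-NA group ends up as -Inf.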

The result I was hoping for is:

Var_1 max_var
Grp 1 3
Grp 2 NA
Grp 3 9

Does anyone have suggestions on how to avoid this warning and get NA for Grp 2 instead of -Inf? Thanks.

I've tried removing all the NA rows first and then calculating with summarise(max()), but this makes Grp 2 disappear. I want to keep this group and have its result be NA.

df <- data.frame(Var_1 = c("Grp 1", "Grp 1", "Grp 1", "Grp 2", "Grp 2", "Grp 2", "Grp 3", "Grp 3", "Grp 3"),
                 Var_2 = c(1,2,3, NA, NA, NA, 7, NA, 9))


ck <- df %>%
      filter(!is.na(Var_2)) %>%
      group_by(Var_1) %>% 
      summarise(max_var = max(Var_2, na.rm = TRUE))

print(ck)

  Var_1 max_var
  <chr>   <dbl>
1 Grp 1       3
2 Grp 3       9
Songlin Tong asked Oct 16 '25 04:10


2 Answers

Another option is to first take the row(s) holding the maximum value per group with slice_max(), and then use distinct() to remove the duplicated rows left over for the group that contains only NAs:

library(dplyr)

ck <- df %>%
  group_by(Var_1) %>% 
  slice_max(Var_2) %>%
  distinct()

print(ck)
#> # A tibble: 3 × 2
#> # Groups:   Var_1 [3]
#>   Var_1 Var_2
#>   <chr> <dbl>
#> 1 Grp 1     3
#> 2 Grp 2    NA
#> 3 Grp 3     9

Created on 2024-04-23 with reprex v2.1.0

Quinten answered Oct 17 '25 17:10


You can write your own function that returns NA when all of the values are NA and the maximum otherwise:

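# Return NA when every value is missing; otherwise defer to max()
fn <- function(x, na.rm = FALSE) {
  if (all(is.na(x))) NA else max(x, na.rm = na.rm)
}

# Per-group summary; the .by argument needs dplyr >= 1.1.0
df %>%
  summarise(max_var = fn(Var_2, na.rm = TRUE), .by = Var_1)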

  Var_1 max_var
1 Grp 1       3
2 Grp 2      NA
3 Grp 3       9
Edward answered Oct 17 '25 19:10
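A possible variation on the custom-function idea, offered only as a sketch: drop the NAs first and check whether anything is left, so that max() is never called on an empty vector. The helper name safe_max is hypothetical, and returning NA_real_ assumes the column is numeric, as Var_2 is in the question.

```r
library(dplyr)

df <- data.frame(Var_1 = c("Grp 1", "Grp 1", "Grp 1", "Grp 2", "Grp 2", "Grp 2", "Grp 3", "Grp 3", "Grp 3"),
                 Var_2 = c(1, 2, 3, NA, NA, NA, 7, NA, 9))

# Hypothetical helper (not from the answers above): assumes a numeric input
safe_max <- function(x) {
  x <- x[!is.na(x)]                          # remove missing values up front
  if (length(x) == 0) NA_real_ else max(x)   # empty group gives a typed NA
}

ck <- df %>%
  group_by(Var_1) %>%
  summarise(max_var = safe_max(Var_2))

print(ck)
```

Because the empty case is handled before max() runs, no warning should be raised, and every group returns a double, which keeps the max_var column type consistent.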