Applying conditional to a subset of rows where rows are different formats in dplyr

Question

I want to flag rows in a dataset where all entries in a subset of columns are NA. The caveat to this is that some rows are strings and some are numbers.

Doing this with columns that are all numbers works fine. Let's say we want an indicator variable that tells us whether columns a to c in the following dataset are all NA

df <- data.frame(a = c(1,4,5,NA),
                 b = c(NA,4,2,NA),
                 c = c(3,7,8,NA),
                 d = c(2,1,1,1))

df

# output
   a  b  c d
1  1 NA  3 2
2  4  4  7 1
3  5  2  8 1
4 NA NA NA 1

Using dplyr we can create a variable indicating what we need

df %>%
  rowwise() %>%
    mutate(allNA = case_when(all(is.na(c_across(cols = a:c))) ~ "allNA",
                             TRUE ~ "notAllNA")) %>%
      ungroup()

# output
# A tibble: 4 × 5
      a     b     c     d allNA   
  <dbl> <dbl> <dbl> <dbl> <chr>   
1     1    NA     3     2 notAllNA
2     4     4     7     1 notAllNA
3     5     2     8     1 notAllNA
4    NA    NA    NA     1 allNA

But if one of the variables is a string instead of a number

df <- data.frame(a = c(1,4,5,NA),
                 b = c(NA,"ava","dillion",NA),
                 c = c(3,7,8,NA),
                 d = c(2,1,1,1))

df

# output
   a       b  c d
1  1    <NA>  3 2
2  4     ava  7 1
3  5 dillion  8 1
4 NA    <NA> NA 1

The same code throws the following error

Error in `mutate()`:
ℹ In argument: `allNA = case_when(...)`.
ℹ In row 1.
Caused by error in `case_when()`:
! Failed to evaluate the left-hand side of formula 1.
Caused by error in `vec_c()`:
! Can't combine `a` <double> and `b` <character>.
Run `rlang::last_trace()` to see where the error occurred.

Any help appreciated.

stefan · Accepted Answer

One option to achieve your desired result which allows for columns of different types would be to use rowSums with across:

library(dplyr, warn.conflicts = FALSE)

df %>%
  mutate(allNA = if_else(
    rowSums(across(a:c, ~ !is.na(.x))) == 0, "allNA", "notAllNA"
  ))
#>    a       b  c d    allNA
#> 1  1    <NA>  3 2 notAllNA
#> 2  4     ava  7 1 notAllNA
#> 3  5 dillion  8 1 notAllNA
#> 4 NA    <NA> NA 1    allNA

Applying conditional to a subset of rows where rows are different formats in dplyr

Tags:

r

dplyr

llewmills

1 Answers

stefan

Recent Activity

Donate For Us

Applying conditional to a subset of rows where rows are different formats in dplyr

Tags:

r

dplyr

llewmills

1 Answers

stefan

Related questions

Recent Activity

Donate For Us