
In R, how do I drop a column whose values are all FALSE?

I have a data frame, df. Some of its columns are logical, and I would like to drop the ones that are all FALSE.

library(tibble)
df <- tibble(A = rep(TRUE, 5),
             B = rep(FALSE, 5),
             C = c(TRUE, FALSE, TRUE, TRUE, FALSE))

df

# A tibble: 5 x 3
  A     B     C    
  <lgl> <lgl> <lgl>
1 TRUE  FALSE TRUE 
2 TRUE  FALSE FALSE
3 TRUE  FALSE TRUE 
4 TRUE  FALSE TRUE 
5 TRUE  FALSE FALSE

The desired output is:

  A     C    
  <lgl> <lgl>
1 TRUE  TRUE 
2 TRUE  FALSE
3 TRUE  TRUE 
4 TRUE  TRUE 
5 TRUE  FALSE

I have tried removing constant columns using the janitor package, but that also removes columns that are all TRUE.

How may I do this? (I prefer a tidyverse solution, but barring that base R or some other available package is acceptable.)

Edit: My minimal working example above was too minimal. I should have mentioned that there are non-logical columns I want to keep too. The solution for me, provided by akrun in chat, was:

library(dplyr)
library(purrr)
df %>% select(where(~ is.logical(.) && any(.)), where(negate(is.logical)))
Rob Creel asked Oct 16 '25 02:10


2 Answers

Using base R with Filter and any

Filter(any, df)

Or in dplyr

library(dplyr)
df %>%
    select(where(any))

Output:

# A tibble: 5 x 2
#  A     C    
#  <lgl> <lgl>
#1 TRUE  TRUE 
#2 TRUE  FALSE
#3 TRUE  TRUE 
#4 TRUE  TRUE 
#5 TRUE  FALSE

Based on the OP's comments, the goal is to keep every non-logical column along with the logical columns that contain at least one TRUE. Note that this selection places the logical columns first in the result:

library(purrr)
df %>% 
  select(where(~ is.logical(.) && any(.)), where(negate(is.logical)))
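For data with mixed column types, a base R counterpart may be useful, since `any` coerces non-logical input and so `Filter(any, df)` is best kept to all-logical data. The following is a minimal sketch, not part of the original answer: the data frame `df_mixed`, its `id` and `name` columns, and the helper `keep_column` are hypothetical, and treating a column of only FALSE and NA as "all FALSE" (via `na.rm = TRUE`) is an assumption.

```r
# Hypothetical data: the question's logical columns plus two non-logical ones.
df_mixed <- data.frame(
  id   = 1:5,
  A    = rep(TRUE, 5),
  B    = rep(FALSE, 5),
  name = letters[1:5],
  C    = c(TRUE, FALSE, TRUE, TRUE, FALSE)
)

# Keep a column unless it is logical and contains no TRUE.
# Assumption: na.rm = TRUE, so a column of only FALSE and NA is dropped.
keep_column <- function(col) !is.logical(col) || any(col, na.rm = TRUE)

# Filter() applies the predicate to each column and keeps the matches,
# leaving the remaining columns in their original order.
result <- Filter(keep_column, df_mixed)

names(result)
# "id" "A" "name" "C"
```

Because this uses a single predicate, the column order is preserved. The same condition should also work as one dplyr selection, for example `select(where(~ !is.logical(.x) || any(.x)))`.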
akrun answered Oct 17 '25 15:10


This is the cleanest solution:

select_if(.tbl = df, .predicate = any)

Explanation:

  • .predicate - a function applied to each column; columns for which it returns TRUE are kept.
  • any - returns TRUE if at least one value is TRUE. Numeric input is coerced to logical, so any(0, -1) is TRUE, while any(0, 0) is FALSE.
    • If a column may contain only 0s, you may want to implement an additional check. The same applies to empty input: any(NULL, NULL) is also FALSE.
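The edge cases listed above can be checked directly in base R. This is a small illustrative sketch; the variable names are made up for the example, and `suppressWarnings()` is used only to silence the coercion warning that `any` raises for double input.

```r
# Logical input: TRUE if at least one element is TRUE.
any_logical   <- any(c(FALSE, TRUE, FALSE))        # TRUE
none_logical  <- any(c(FALSE, FALSE))              # FALSE

# Empty input: there is no TRUE to find, so the result is FALSE.
any_empty     <- any(NULL)                         # FALSE

# Missing values: with no TRUE present, NA makes the result NA
# unless na.rm = TRUE is given.
any_with_na   <- any(c(FALSE, NA))                 # NA
any_na_rm     <- any(c(FALSE, NA), na.rm = TRUE)   # FALSE

# Numeric input is coerced to logical: 0 is FALSE, anything else is TRUE.
any_mixed_num <- suppressWarnings(any(c(0, -1)))   # TRUE
any_zero_num  <- suppressWarnings(any(c(0, 0)))    # FALSE
```

If the data may contain numeric or missing values, checking `is.logical()` first and passing `na.rm = TRUE` is one way to keep the predicate's result a plain TRUE or FALSE.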

If you want to avoid those edge cases, a safer option is to apply the test to logical columns only and keep everything else:

select_if(.tbl = df, .predicate = ~ !is.logical(.x) || any(.x))
Konrad answered Oct 17 '25 16:10


