
Remove rows from a data frame that match on multiple criteria

Tags:

r

dplyr

I wish to remove rows of my data frame that contain a specific pattern and I wish to use tidyverse syntax if possible.

I wish to remove rows where col1 is "cat" and any of col2:col4 contains one of the following words: dog, fox, or cow. For this example, that removes rows 1 and 4 of the original data.

Here's a sample dataset:

df <- data.frame(col1 = c("cat", "fox", "dog", "cat", "pig"),
                 col2 = c("lion", "tiger", "elephant", "dog", "cow"),
                 col3 = c("bird", "cow", "sheep", "fox", "dog"),
                 col4 = c("dog", "cat", "cat", "cow", "fox"))

I've tried a number of across() variants but keep running into issues. Here is my latest attempt:

filtered_df <- df %>%
  filter(!(col1 == "cat" & !any(cowfoxdog <- across(col2:col4, ~ . %in% c("cow", "fox", "dog")))))

This returns the following error:

Error in `filter()`:
! Problem while computing `..1 = !...`.
Caused by error in `FUN()`:
! only defined on a data frame with all numeric variables
TheGoat asked Dec 08 '25 06:12


1 Answer

You can use if_any(). For a more robust test, I first added a row where col1 == "cat" but "dog", "fox", or "cow" don't appear in columns 2-4.

library(dplyr)

df <- df %>% 
  add_row(col1 = "cat", col2 = "sheep", col3 = "lion", col4 = "tiger")

df %>%
  filter(!(col1 == "cat" & if_any(col2:col4, \(x) x %in% c("dog", "fox", "cow"))))

  col1     col2  col3  col4
1  fox    tiger   cow   cat
2  dog elephant sheep   cat
3  pig      cow   dog   fox
4  cat    sheep  lion tiger
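
As a cross-check, here is a minimal sketch that rebuilds the same row-wise test in base R and compares it with the if_any() version. any() is not a row-wise operation: applied to the data frame that across() returns, it does not produce one logical value per row, which is what filter() needs. if_any() does produce one value per row. The names targets, hit, and drop are illustrative only and do not appear in the question.

```r
library(dplyr)

# Sample data from the question, plus the extra "cat" row that should be kept
df <- data.frame(
  col1 = c("cat", "fox", "dog", "cat", "pig", "cat"),
  col2 = c("lion", "tiger", "elephant", "dog", "cow", "sheep"),
  col3 = c("bird", "cow", "sheep", "fox", "dog", "lion"),
  col4 = c("dog", "cat", "cat", "cow", "fox", "tiger")
)

targets <- c("dog", "fox", "cow")  # illustrative name for the word list

# One logical per row: TRUE if any of col2..col4 holds a target word
hit <- Reduce(`|`, lapply(df[c("col2", "col3", "col4")], `%in%`, targets))

# Rows to drop: col1 is "cat" AND at least one target word in col2..col4
drop <- df$col1 == "cat" & hit

# Base R version of the filter
base_result <- df[!drop, ]

# dplyr version, same condition expressed with if_any()
dplyr_result <- df %>%
  filter(!(col1 == "cat" & if_any(col2:col4, \(x) x %in% targets)))
```

Both results keep the same four rows; only the row names differ, because base subsetting keeps the original row names while filter() renumbers them.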
zephryl answered Dec 09 '25 20:12