Quick way to select rows with matching terms in a list in data frame

Tags: list, dataframe, r

I have a data frame with a list column that stores IDs:

a <- list(as.character(c(1, 2, 3)))
b <- list(as.character(c(2, 3, 5)))
c <- list(as.character(c(4, 6, 8)))
df <- data.frame(NAME = c("A1", "A2", "A3"), stat = c(14, 15, 16)) 
df$IDs[1] <- a      
df$IDs[2] <- b   
df$IDs[3] <- c

Additionally, I have a list holding a character vector of reference IDs that I want to track:

x <- list(as.character(c(2, 3)))

I would like to filter the data frame so that it contains only the rows whose IDs column includes 2 and/or 3 (i.e., rows where x matches df$IDs; in this case, only the rows named A1 and A2).

The actual data frame has hundreds of rows, so I would appreciate a shorter route than a loop if possible.
If you have a different approach to suggest (such as wrangling the initial df a bit more), I'd appreciate hearing it as well.

Many thanks in advance.

M. L asked Sep 18 '25 10:09


2 Answers

Using tidyverse

library(dplyr)
library(purrr)
# Keep rows where any reference ID in x appears in that row's IDs
df %>%
    filter(map_lgl(IDs, ~ any(unlist(x) %in% .x)))
Output
#   NAME stat     IDs
# 1   A1   14 1, 2, 3
# 2   A2   15 2, 3, 5
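
The question says "2 and/or 3", and with that reference list the "or" and "and" readings happen to select the same rows. As a rough sketch of how the two differ, the snippet below swaps any() for all(). The vector `wanted` and its values are hypothetical, chosen only so that the two results diverge; the data frame is rebuilt from the question so the snippet runs on its own.

```r
library(dplyr)
library(purrr)

# Rebuild the example data from the question
df <- data.frame(NAME = c("A1", "A2", "A3"), stat = c(14, 15, 16))
df$IDs <- list(c("1", "2", "3"), c("2", "3", "5"), c("4", "6", "8"))

# Hypothetical reference IDs (not from the question), picked so that
# the "or" and "and" filters give different answers
wanted <- c("3", "5")

# "or": keep rows holding at least one wanted ID
either <- df %>% filter(map_lgl(IDs, ~ any(wanted %in% .x)))

# "and": keep rows holding every wanted ID
both <- df %>% filter(map_lgl(IDs, ~ all(wanted %in% .x)))

either$NAME  # A1 and A2
both$NAME    # A2 only
```

Only the reduction function changes, so it should be easy to switch once the intended meaning of "and/or" is settled.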
akrun answered Sep 20 '25 01:09


You could use sapply or mapply:

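# For each row, check whether any reference ID appears in that row's IDs
df[sapply(df$IDs, \(id) any(x[[1]] %in% id)), ]
# Same test with mapply; the length-1 list x is recycled across df$IDs
df[mapply(\(a, b) any(a %in% b), x, df$IDs), ]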
Output
#   NAME stat     IDs
# 1   A1   14 1, 2, 3
# 2   A2   15 2, 3, 5
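
Since the question also invites reshaping the data first, here is one possible base R sketch of that route: flatten the list column into a long table with one row per NAME/ID pair, then match with a single vectorised %in%. The names `long`, `hits` and `result` are made up for illustration, and the approach assumes NAME uniquely identifies each row.

```r
# Rebuild the example data from the question
df <- data.frame(NAME = c("A1", "A2", "A3"), stat = c(14, 15, 16))
df$IDs <- list(c("1", "2", "3"), c("2", "3", "5"), c("4", "6", "8"))
x <- list(c("2", "3"))

# Long format: repeat each NAME once per ID it holds
# (assumes NAME is unique per row)
long <- data.frame(
  NAME = rep(df$NAME, lengths(df$IDs)),
  ID   = unlist(df$IDs)
)

# Names that hold at least one ID of interest
hits <- unique(long$NAME[long$ID %in% unlist(x)])

# Filter the original data frame by those names
result <- df[df$NAME %in% hits, ]
result$NAME  # A1 and A2
```

The long table may also be convenient if the IDs are needed later for joins or counts, though for a one-off filter the sapply version above is shorter.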
Darren Tsai answered Sep 20 '25 01:09