I need something along the lines of CTRL + F in Microsoft Excel to search for a string across a whole data frame (preferably a dplyr solution).
I updated my reprex based on the suggestions by Ronak and Akrun. Both are excellent: one relies on base R (grepl) and the other on str_detect. I personally prefer the latter, only because it performs better on large datasets on my machine. Thank you both!
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(stringr)
## Two functions suggested by Ronak
find_text <- function(df, tt, ...){
  res <- df %>%
    mutate(across(where(is.character), ~grepl(tt, .x, ...)))
  return(res)
}
find_text_filter <- function(df, tt, ...){
  res <- df %>%
    filter(if_any(where(is.character), ~grepl(tt, .x, ...)))
  return(res)
}
## And now the str_detect variation by Akrun
find_text2 <- function(df, tt){
  res <- df %>%
    mutate(across(where(is.character), ~str_detect(.x, tt)))
  return(res)
}
find_text_filter2 <- function(df, tt){
  res <- df %>%
    filter(if_any(where(is.character), ~str_detect(.x, tt)))
  return(res)
}
df <- tibble(a = seq(5),
             b = c("hfh", "gjgkjguk", "jyfyujyuj ygujyg", "uyyhjg", "776uj"),
             d = c("ggg", "hhh", "gfrr", "67hn", "jnug"),
             e = c("gtdfdc", " kjihi", "hgwjhfg", "ujyggg", "ut 089jhjm"))
df1 <- df %>%
find_text("gj")
df1 ## this works: it shows in which text column and in which row the text appears
#> # A tibble: 5 x 4
#> a b d e
#> <int> <lgl> <lgl> <lgl>
#> 1 1 FALSE FALSE FALSE
#> 2 2 TRUE FALSE FALSE
#> 3 3 FALSE FALSE FALSE
#> 4 4 FALSE FALSE FALSE
#> 5 5 FALSE FALSE FALSE
## and now filtering the matching rows also works
df2 <- df %>%
find_text_filter("gj")
df2
#> # A tibble: 1 x 4
#> a b d e
#> <int> <chr> <chr> <chr>
#> 1 2 gjgkjguk hhh " kjihi"
## same with the str_detect functions
df3 <- df %>%
find_text2("gj")
df3
#> # A tibble: 5 x 4
#> a b d e
#> <int> <lgl> <lgl> <lgl>
#> 1 1 FALSE FALSE FALSE
#> 2 2 TRUE FALSE FALSE
#> 3 3 FALSE FALSE FALSE
#> 4 4 FALSE FALSE FALSE
#> 5 5 FALSE FALSE FALSE
df4 <- df %>%
find_text_filter2("gj")
df4
#> # A tibble: 1 x 4
#> a b d e
#> <int> <chr> <chr> <chr>
#> 1 2 gjgkjguk hhh " kjihi"
Created on 2021-05-20 by the reprex package (v2.0.0)
We could use str_detect:
library(dplyr)
library(stringr)
find_text_filter <- function(df, tt){
  df %>%
    filter(if_any(where(is.character), ~str_detect(.x, tt)))
}
Testing:
df %>%
find_text_filter("gj")
# A tibble: 1 x 4
# a b d e
# <int> <chr> <chr> <chr>
#1 2 gjgkjguk hhh " kjihi"
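One thing worth keeping in mind: str_detect treats the search text as a regular expression, whereas CTRL + F in Excel searches for literal text. The following is a minimal sketch of a literal variant, assuming stringr's fixed() modifier is acceptable here; the helper name find_text_literal, its ignore_case argument, and the small example tibble are hypothetical and not part of the original answer.

```r
library(dplyr)
library(stringr)

# Hypothetical helper: like find_text_filter, but the search text is
# matched literally (no regex metacharacters) via stringr::fixed().
find_text_literal <- function(df, tt, ignore_case = FALSE) {
  df %>%
    filter(if_any(where(is.character),
                  ~ str_detect(.x, fixed(tt, ignore_case = ignore_case))))
}

# Illustrative data only: "." would match any character in a regex,
# so a regex search for "a.b" would also hit "axb".
df_demo <- tibble(a = 1:3,
                  b = c("a.b", "axb", "A.B"),
                  d = c("x", "y", "z"))

find_text_literal(df_demo, "a.b")                      # literal match only
find_text_literal(df_demo, "a.b", ignore_case = TRUE)  # also ignores case
```

With a plain regex pattern, "a.b" would match both "a.b" and "axb"; wrapping the pattern in fixed() restricts the match to the literal string.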
You can use if_any here:
library(dplyr)
find_text_filter <- function(df, tt, ...){
  res <- df %>%
    filter(if_any(where(is.character), ~grepl(tt, .x, ...)))
  return(res)
}
df %>% find_text_filter("gj")
# A tibble: 1 x 4
# a b d e
# <int> <chr> <chr> <chr>
#1 2 gjgkjguk hhh " kjihi"
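grepl itself accepts ignore.case and fixed arguments, which helps when the search should behave more like a plain CTRL + F. Below is a rough sketch of a base R variant that spells these out as named arguments; the function name find_text_base, its argument names, and the example tibble are assumptions made for illustration only.

```r
library(dplyr)

# Hypothetical variant: the grepl options are exposed as explicit
# arguments. Use either ignore_case or fixed, not both, because
# grepl ignores ignore.case (with a warning) when fixed = TRUE.
find_text_base <- function(df, tt, ignore_case = FALSE, fixed = FALSE) {
  df %>%
    filter(if_any(where(is.character),
                  ~ grepl(tt, .x, ignore.case = ignore_case, fixed = fixed)))
}

# Illustrative data only
df_base <- tibble(a = 1:3,
                  b = c("GJx", "a.b", "axb"),
                  d = c("q", "r", "s"))

find_text_base(df_base, "gj")                      # case-sensitive: no rows
find_text_base(df_base, "gj", ignore_case = TRUE)  # matches "GJx"
find_text_base(df_base, "a.b", fixed = TRUE)       # literal "a.b" only
```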