 

How to remove the text on both sides of a given substring in one step (regex)?

Tags:

regex

r

stringr

What is the simplest way to remove the text on both the left and the right side of a given character/text in R?

Here is an example of my data: a = c("C:\\final docs with data/Gakenke_New_Sanitation.xlsx", "C:\\final docs with data/Gatsibo_New_Sanitation.xlsx", "C:\\final docs with data/Rutsiro_New_Sanitation.xlsx")

My expected output is: Gakenke, Gatsibo and Rutsiro.

I know I can break this task down into two steps and handle it with mutate() as follows:

a %>% mutate(a = str_remove(a, "C.+/"), a = str_remove(a, "_.+"))

My question is: which single pattern can I pass inside that mutate() call to get my intended results (Gakenke, Gatsibo and Rutsiro) in one step?

Any help is much appreciated. Thank you!

Birasafab asked Dec 02 '25 22:12


2 Answers

You can use stringr::str_remove_all with a single alternation pattern:

a = c("C:\\final docs with data/Gakenke_New_Sanitation.xlsx", "C:\\final docs with data/Gatsibo_New_Sanitation.xlsx",  "C:\\final docs with data/Rutsiro_New_Sanitation.xlsx")
library(stringr)
str_remove_all(a, "^.*/|_.*")
## => [1] "Gakenke" "Gatsibo" "Rutsiro"

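stringr::str_remove_all removes every match of the pattern, not only the first one. The pattern ^.*/|_.* is an alternation of two parts: ^.*/ matches from the start of the string up to and including the last /, and _.* matches from the first remaining _ to the end of the string. Since . does not match line break characters, the string is assumed to contain none.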
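Since the question asks about mutate(), here is a minimal sketch of the same pattern applied to a data frame column. The data frame df and the column names path and district are hypothetical names chosen for illustration; only the regex comes from the answer above.

```r
library(dplyr)
library(stringr)

# Hypothetical data frame: `df`, `path`, and `district` are assumed names
df <- tibble(path = c(
  "C:\\final docs with data/Gakenke_New_Sanitation.xlsx",
  "C:\\final docs with data/Gatsibo_New_Sanitation.xlsx",
  "C:\\final docs with data/Rutsiro_New_Sanitation.xlsx"
))

# One pass: remove the prefix up to the last "/" and the suffix
# starting at the first "_" that follows it
result <- df %>%
  mutate(district = str_remove_all(path, "^.*/|_.*"))

result$district
```

Writing to a new column keeps the original path available for checking the result; assigning back to path instead would overwrite it in place.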

Wiktor Stribiżew answered Dec 04 '25 12:12


A possible solution, based on stringr::str_extract and lookaround:

library(stringr)  # str_extract() and the %>% pipe

# a is the character vector defined in the question
a %>% 
  str_extract("(?<=data/).*(?=_New)")

#> [1] "Gakenke" "Gatsibo" "Rutsiro"
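
The lookarounds above hard-code the folder name data and the suffix _New. As a sketch of a slightly more general variant, assuming paths shaped like the sample data (a single / followed by the wanted name and then an underscore), the lookarounds can anchor on the delimiters alone. The base R sub() line is a different technique, a capture group rather than lookarounds, shown for comparison. The variable names by_lookaround and by_capture are illustrative.

```r
library(stringr)

a <- c(
  "C:\\final docs with data/Gakenke_New_Sanitation.xlsx",
  "C:\\final docs with data/Gatsibo_New_Sanitation.xlsx",
  "C:\\final docs with data/Rutsiro_New_Sanitation.xlsx"
)

# Lookbehind for "/" and lookahead for "_": extract the run of characters
# between them that contains neither delimiter
by_lookaround <- str_extract(a, "(?<=/)[^/_]+(?=_)")

# Base R alternative: capture the same run and replace the whole string with it
by_capture <- sub("^.*/([^/_]+)_.*$", "\\1", a)

by_lookaround
by_capture
```

Extraction returns NA for strings that do not match, whereas sub() returns the input unchanged, which may matter when some paths do not follow the expected shape.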

PaulS answered Dec 04 '25 10:12


