
R: use str_extract (stringr) to extract a string between "_"

Tags: r, extract, stringr

I have some strings in a vector like:

x <- c("ROH_Pete_NA_1_2017.zip",
   "ROH_Annette_SA_2_2016.zip",
   "ROH_Steve_MF_4_2015.zip")

I need to extract the names from these strings (Pete, Annette, Steve). I would like to do this in a loop and with str_extract().

All strings start with ROH_, but the names differ in length, and so do the parts that follow them.

I would like to use str_extract(), but I'm also happy with other solutions.

Thank you for your help.

7660 asked Nov 19 '25 07:11


2 Answers

Here is a solution with str_extract:

library(stringr)
str_extract(x, "(?<=_).+?(?=_)")
# [1] "Pete"    "Annette" "Steve"  
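
In case the lookarounds are unfamiliar: (?<=_) requires an underscore immediately before the match and (?=_) requires one immediately after it, and neither underscore becomes part of the result. The lazy .+? stops at the first underscore that follows. As a rough sketch, the same pattern should also work without stringr through regexpr() and regmatches(), assuming perl = TRUE so that lookbehind is available (first_field is only an illustrative name). Like str_extract(), this is vectorised over x, so no explicit loop is needed.

```r
x <- c("ROH_Pete_NA_1_2017.zip",
       "ROH_Annette_SA_2_2016.zip",
       "ROH_Steve_MF_4_2015.zip")

# regexpr() locates the first match in each string;
# regmatches() then pulls out the matched text.
# perl = TRUE is needed for the lookbehind (?<=_).
first_field <- regmatches(x, regexpr("(?<=_).+?(?=_)", x, perl = TRUE))
first_field
# [1] "Pete"    "Annette" "Steve"
```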

You can also use gsub in base R:

gsub("^.+?_|_.+$", "", x)
# [1] "Pete"    "Annette" "Steve"  
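
Since the fields are separated by underscores, splitting may be simpler than a regular expression. This is only a minimal sketch, and it assumes the name is always the second field; names_only is an illustrative variable name.

```r
x <- c("ROH_Pete_NA_1_2017.zip",
       "ROH_Annette_SA_2_2016.zip",
       "ROH_Steve_MF_4_2015.zip")

# Split each string on literal underscores (fixed = TRUE skips regex parsing).
parts <- strsplit(x, "_", fixed = TRUE)

# Assumption: the name is always the second field of each file name.
names_only <- vapply(parts, function(p) p[2], character(1))
names_only
# [1] "Pete"    "Annette" "Steve"
```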
Sven Hohenstein answered Nov 20 '25 22:11


You are probably better off with str_match, as it supports capture groups: you can include the _ on either side for context but return only the part you are interested in. The (\\w+?) is the capture group, and str_match returns it as the second column, hence the [,2] (the first column is the full match, which is what str_extract would return).

library(stringr)
str_match(x, "ROH_(\\w+?)_")[, 2]
# [1] "Pete"    "Annette" "Steve"
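
To make the difference between the full match and the capture group visible, here is a hedged base R sketch of the same idea using regexec() and regmatches(), which return the full match followed by each captured group. It swaps \\w+? for [^_]+, since \w also matches an underscore and therefore only works here in its lazy form; full_match and name are illustrative variable names.

```r
x <- c("ROH_Pete_NA_1_2017.zip",
       "ROH_Annette_SA_2_2016.zip",
       "ROH_Steve_MF_4_2015.zip")

# For each string, m holds c(full match, first capture group).
m <- regmatches(x, regexec("ROH_([^_]+)_", x))

# Element 1 is the whole match, including the surrounding context.
full_match <- vapply(m, function(g) g[1], character(1))
full_match
# [1] "ROH_Pete_"    "ROH_Annette_" "ROH_Steve_"

# Element 2 is the capture group, i.e. only the name.
name <- vapply(m, function(g) g[2], character(1))
name
# [1] "Pete"    "Annette" "Steve"
```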
Andrew Gustar answered Nov 20 '25 21:11