I have a vector of strings and want to add a + before each word in each string.
library(stringr)
strings <- c('string one', 'string two', 'string three')
strings_new <- str_replace_all(strings, "\\b\\w", '+')
strings_new
Unfortunately, this replaces the first character of each word instead of adding the + symbol in front of it. I'm not familiar enough with regex to know how to fix this.
Any help would be great.
Thanks
Using a capture group is one way of doing this. Wrap the pattern in parentheses to capture the matched character, then recall it in the replacement with \\1.
strings_new <- str_replace_all(strings, "(\\b\\w)", '+\\1')
strings_new
[1] "+string +one" "+string +two" "+string +three"
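To make the difference concrete, here is a small side-by-side sketch of the question's pattern and the capture-group version. It assumes the stringr package is installed; the variable names `replaced` and `captured` are only illustrative.

```r
library(stringr)

strings <- c('string one', 'string two', 'string three')

# Without a group, the matched first letter is consumed and lost
replaced <- str_replace_all(strings, "\\b\\w", "+")
replaced
# [1] "+tring +ne"   "+tring +wo"   "+tring +hree"

# With a group, \\1 puts the captured letter back after the +
captured <- str_replace_all(strings, "(\\b\\w)", "+\\1")
captured
# [1] "+string +one"   "+string +two"   "+string +three"
```

The only change is that the replacement now contains the text that was matched, so nothing is dropped from the original string.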
You may use a base R solution with the PCRE pattern [[:<:]], which matches a starting word boundary, i.e. the position between a non-word char (or the start of the string) and a word char:
strings <- c('string one', 'string two', 'string three')
gsub("[[:<:]]", "+", strings, perl=TRUE)
# => [1] "+string +one" "+string +two" "+string +three"
Or, you may use the TRE regex (\w+), which matches one or more word chars (letters, digits, or _) and captures them into Group 1. Replace with a + followed by the replacement backreference \1, which restores the consumed chars in the output:
gsub("(\\w+)", '+\\1', strings)
# => [1] "+string +one" "+string +two" "+string +three"
Note that you do not need a word boundary here: the first word char matched is already at a word boundary, and the subsequent word chars are consumed by the + quantifier. See the regex demo.
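As a quick check of that claim, the sketch below compares the pattern with and without an explicit \\b in base R. The input string and the names `x`, `no_boundary`, and `with_boundary` are made up for illustration; perl=TRUE is used for the \\b variant only as a cautious choice.

```r
# Hypothetical input mixing letters, digits, an underscore and a hyphen
x <- "foo_bar 42 baz-qux"

# Greedy \\w+ always starts at the first char of a word
no_boundary <- gsub("(\\w+)", "+\\1", x)

# Same pattern with an explicit word boundary in front
with_boundary <- gsub("\\b(\\w+)", "+\\1", x, perl = TRUE)

no_boundary
# [1] "+foo_bar +42 +baz-+qux"
identical(no_boundary, with_boundary)
# [1] TRUE
```

The hyphen is not a word char, so "baz" and "qux" count as separate words and each gets its own +.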
And with stringr's str_replace_all, which is based on the ICU regex engine, you may use
> str_replace_all(strings, "\\w+", '+\\0')
[1] "+string +one" "+string +two" "+string +three"
The \\0 is a replacement backreference to the whole match.