Conditionally add a whitespace between a number and a special character in R

I'm trying to use stringr or base R calls to conditionally add a whitespace wherever, in a large vector, a numeric value is immediately followed by a special character (in this case a $ sign) with no space in between. str_pad doesn't appear to allow for a reference vector.

For example, for:

$6.88$7.34

I'd like to add a whitespace after the last number and before the next dollar sign:

$6.88 $7.34

Thanks!

js80 asked Sep 13 '25 17:09


2 Answers

If there is only one instance per string, use sub: capture the digit and the $ as separate groups, then put a space between the two backreferences in the replacement.

sub("([0-9])([$])", "\\1 \\2", v1)
#[1] "$6.88 $7.34"

Or with regex lookarounds, which match the zero-width position between a digit and a $ (gsub replaces every such position):

gsub("(?<=[0-9])(?=[$])", " ", v1, perl = TRUE)

data

v1 <- "$6.88$7.34"
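
To make the difference between sub and gsub concrete, here is a minimal sketch in base R. The vector `prices` and the names `first_only` and `all_matches` are made up for illustration and are not part of the original answer.

```r
# Hypothetical input: the second element has three prices run together
prices <- c("$6.88$7.34", "$1.00$2.50$3.75")

# sub() replaces only the first digit-then-$ boundary in each element
first_only <- sub("([0-9])([$])", "\\1 \\2", prices)

# gsub() replaces every digit-then-$ boundary in each element
all_matches <- gsub("([0-9])([$])", "\\1 \\2", prices)

print(first_only)
print(all_matches)
```

Both functions are vectorised over their input, so no loop is needed for a large vector; the choice between them only depends on whether an element can contain more than one boundary.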
akrun answered Sep 16 '25 09:09


This works when a single string contains several instances, because gsub replaces every match:

mystring <- "$6.88$7.34 $8.34$4.31"

gsub("(?<=\\d)\\$", " $", mystring, perl = TRUE)

[1] "$6.88 $7.34 $8.34 $4.31"

A $ that already has a space before it is left unchanged, because the look-behind only matches when a digit comes immediately before the $.
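
As a rough check of that behaviour, the sketch below wraps the same gsub call in a helper and applies it twice. The function name `space_before_dollar` and the vector `amounts` are hypothetical, chosen only for this example.

```r
# Hypothetical helper: insert a space before any "$" that directly follows a digit
space_before_dollar <- function(x) {
  gsub("(?<=\\d)\\$", " $", x, perl = TRUE)
}

# Hypothetical input mixing unspaced, already spaced, unrelated and missing values
amounts <- c("$6.88$7.34", "$8.34 $4.31", "no prices here", NA)

once  <- space_before_dollar(amounts)
twice <- space_before_dollar(once)

print(once)
# A second pass finds nothing left to change
print(identical(once, twice))
```

Because the pattern requires a digit directly before the $, running the helper again on its own output should not add further spaces, and missing values stay missing.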

Regarding the question asked in the comments:

mystring2<-as.vector('Regular_Distribution_Type† Income Only" "Distribution_Rate 5.34%" "Distribution_Amount $0.0295" "Distribution_Frequency Monthly')

gsub("(?<=[[:alpha:]])\\s(?=[[:alpha:]]+)", "_", mystring2, perl = TRUE)

[1] "Regular_Distribution_Type<U+2020> Income_Only\" \"Distribution_Rate 5.34%\" \"Distribution_Amount $0.0295\" \"Distribution_Frequency_Monthly"

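Note that the \ in the printed output is only R escaping the double quotes that sit inside the string; it is not part of the data and makes no difference. Similarly, <U+2020> is how R prints the † character when the console encoding cannot display it.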

Explanation of regex:

(?<=[[:alpha:]]) This first part is a positive look-behind, introduced by ?<=. It checks that whatever we are trying to match is immediately preceded by what the look-behind defines, without making that part of the match. In this case we look for [[:alpha:]], which matches an alphabetic character.

We then match a whitespace character with \s (this also covers tabs and newlines, not only a plain space). In an R string the backslash has to be escaped, so it is written \\s. This is the only part that is actually matched.

Finally we use (?=[[:alpha:]]+), a positive look-ahead introduced by ?=, which checks that the match is immediately followed by a letter. The + is not strictly needed here, since one following letter is enough for the look-ahead to succeed.

The logic is to find a whitespace character that sits between two letters and match only that character, which gsub then replaces with a _
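
One reason to prefer lookarounds over capture groups for this task is that lookarounds do not consume the neighbouring letters. The sketch below compares the two on a made-up string; `words`, `with_lookaround` and `with_groups` are illustrative names, and the capture-group pattern is an alternative shown for contrast, not something from the answer above.

```r
# Hypothetical input: single letters separated by single spaces
words <- "a b c"

# Lookarounds only assert the neighbouring letters, so each space is matched
with_lookaround <- gsub("(?<=[[:alpha:]])\\s(?=[[:alpha:]])", "_", words, perl = TRUE)

# Capture groups consume both letters, so "b" is used up by the first match
# and cannot serve as the left-hand letter for the second space
with_groups <- gsub("([[:alpha:]])\\s([[:alpha:]])", "\\1_\\2", words, perl = TRUE)

print(with_lookaround)
print(with_groups)
```

With words longer than one letter the two forms usually agree, so the difference only shows up when a single letter sits between two spaces.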


Chabo answered Sep 16 '25 07:09