I have a vector a as follows:
a <- c("Rs. 360 Rs. 540 [-33% ]", "Rs. 213 Rs. 250 [-15% ]", "Rs. 430 Rs. 1030 [-58% ]")
I need the result to be:
Rs.360, Rs.213, Rs.430
I tried:
a <- gsub(" Rs*", "", a)
As I said in my comment, you can use substr to extract the beginning of each string, provided the pattern is always the same (same number of digits). You can then remove the space if you want to:
substr(a, 1, 7)
[1] "Rs. 360" "Rs. 213" "Rs. 430"
sub(" ", "", substr(a, 1, 7))
[1] "Rs.360" "Rs.213" "Rs.430"
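The substr approach relies on every price having exactly three digits. As a minimal sketch of what happens otherwise, the vector b below is a made-up input (not from the question) with 2- and 4-digit prices; regmatches with regexpr is swapped in as one possible width-independent alternative, not as part of the original answer.

```r
# Hypothetical input with 2- and 4-digit prices (not from the question).
b <- c("Rs. 36 Rs. 54 [-33% ]", "Rs. 1360 Rs. 1540 [-12% ]")

# Fixed-width extraction always cuts at character 7, whatever the price length.
substr(b, 1, 7)
# [1] "Rs. 36 " "Rs. 136"

# Width-independent alternative: keep only the leading "Rs. <digits>" match.
first_price <- regmatches(b, regexpr("^Rs\\. \\d+", b))

# Drop the single space, as in the answer above.
res <- sub(" ", "", first_price, fixed = TRUE)
res
# [1] "Rs.36"   "Rs.1360"
```

Note that regmatches silently drops elements that do not match, so the result can be shorter than the input if some strings lack a leading price.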
Or you can capture the pattern you want in the string and form another string with just that:
gsub("^[A-Za-z.]{3} (\\d{3}).+", "Rs.\\1", a)
[1] "Rs.360" "Rs.213" "Rs.430"
Here you capture only the three digits and explicitly put the Rs. prefix back.
Or you can "erase" everything that you don't want: the space and everything that comes after the pattern you want to keep:
gsub("(\\s)|([A-Za-z0-9. ]{8}\\s\\[-*\\d+%\\s*\\])", "", a)
[1] "Rs.360" "Rs.213" "Rs.430"
Here, the pattern removes either a whitespace character (\\s) or a run of 8 characters that are each alphanumeric, a dot, or a space, followed by a whitespace character, an opening bracket, zero or more minus signs, one or more digits, a % sign, zero or more whitespace characters, and finally a closing bracket.
You may use a regex with capturing groups to grab the parts you need, then use backreferences in the replacement pattern to insert them back into the result:
sub("^\\s*(Rs\\.)\\s*(\\d+).*", "\\1\\2", a)
The regex matches:
^ - start of string
\\s* - zero or more whitespaces
(Rs\\.) - Group 1 capturing the Rs. sequence
\\s* - 0+ whitespaces
(\\d+) - Group 2 capturing 1 or more digits
.* - the rest of the string to its end
Tested code:
> a <- c("Rs. 360 Rs. 540 [-33% ]", "Rs. 213 Rs. 250 [-15% ]", "Rs. 430 Rs. 1030 [-58% ]")
> sub("^\\s*(Rs\\.)\\s*(\\d+).*", "\\1\\2", a)
[1] "Rs.360" "Rs.213" "Rs.430"
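If you need this in more than one place, the same call can be wrapped in a small helper. This is only a sketch: the name extract_first_price and the choice to return NA for strings that do not match are my assumptions, not something the question asks for. It is worth handling that case because sub returns the input unchanged when the pattern does not match.

```r
# Hypothetical helper (name and NA handling are assumptions for illustration).
extract_first_price <- function(x) {
  pattern <- "^\\s*(Rs\\.)\\s*(\\d+).*"
  # Keep Group 1 ("Rs.") and Group 2 (the first run of digits).
  out <- sub(pattern, "\\1\\2", x)
  # sub() leaves non-matching strings untouched, so mark them explicitly.
  out[!grepl(pattern, x)] <- NA_character_
  out
}

# The second and third elements are made-up inputs to show other cases.
prices <- extract_first_price(c("Rs. 360 Rs. 540 [-33% ]",
                                "Rs. 1030 Rs. 2000 [-48% ]",
                                "no price here"))
prices
# [1] "Rs.360"  "Rs.1030" NA
```

Because the pattern uses \\d+ rather than a fixed count, it does not depend on the number of digits in the price.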
Update
For an input like a <- c(" 360 540", " 213 250"), use sub("^\\D*(\\d+).*", "\\1", a).
> a <- c(" 360 540", " 213 250")
> sub("^\\D*(\\d+).*", "\\1", a)
[1] "360" "213"
The pattern ^\\D*(\\d+).* matches any number of non-digit characters at the start of the string, then captures 1+ digits into Group 1, and then .* matches the rest of the string.
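The same digits-only pattern also works on the original vector, since \\D* skips over the leading "Rs. " as well. A short sketch, assuming you want numbers for arithmetic rather than strings (the as.numeric step is my addition, not part of the question):

```r
a <- c("Rs. 360 Rs. 540 [-33% ]", "Rs. 213 Rs. 250 [-15% ]", "Rs. 430 Rs. 1030 [-58% ]")

# Keep only the first run of digits in each string.
digits <- sub("^\\D*(\\d+).*", "\\1", a)
digits
# [1] "360" "213" "430"

# Convert to numeric if the values are needed for calculations.
amounts <- as.numeric(digits)
amounts
# [1] 360 213 430
```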