How to remove everything contained after the first number of a string?
x <- c("Hubert 208 apt 1", "Mass Av 300, block 3")
Following this question, I managed to remove everything up to and including the first number:
gsub( "^\\D*\\d+", "", x )
[1] " apt 1" ", block 3"
But the desired output looks like this:
[1] "Hubert 208" "Mass Av 300"
A minor change to the OP's code makes it work: capture the matching pattern as a group ((...)), add .* to match the rest of the string, and replace the whole match with the backreference (\\1).
sub("^(\\D*\\d+).*", "\\1", x)
#[1] "Hubert 208" "Mass Av 300"
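Here, the OP's pattern (^\\D*\\d+) matches zero or more non-digit characters (\\D*) from the start (^) of the string, followed by one or more digits (\\d+); this part is captured as a group with parentheses ((...)). The trailing .* matches everything after that first number, so replacing the whole match with \\1 keeps only the captured part.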
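To see what the captured part matches on its own, here is a minimal sketch using base R's regexpr and regmatches (not part of the original answer; the variable names kept and result are only illustrative). It extracts the match of ^\\D*\\d+ directly and compares it with the sub approach:

```r
x <- c("Hubert 208 apt 1", "Mass Av 300, block 3")

# Extract what the captured part, ^\\D*\\d+, matches in each string
kept <- regmatches(x, regexpr("^\\D*\\d+", x))
print(kept)
#[1] "Hubert 208"  "Mass Av 300"

# Full pattern: group 1 is kept, the text matched by the trailing .* is dropped
result <- sub("^(\\D*\\d+).*", "\\1", x)
print(result)
#[1] "Hubert 208"  "Mass Av 300"
```

Note that regmatches drops elements with no match, whereas sub returns one element per input, so the two only line up when every string contains a number.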
Also, sub is enough here instead of gsub (global substitution), because the pattern is anchored at the start of the string and only needs to match once per element.
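As a quick check of that point, the sketch below (an illustration, with pattern, with_sub, with_gsub and no_digit as made-up names) runs both functions with the same anchored pattern. It also shows what happens to a string that contains no number at all:

```r
x <- c("Hubert 208 apt 1", "Mass Av 300, block 3")
pattern <- "^(\\D*\\d+).*"

# The pattern is anchored with ^ and .* consumes the rest of the string,
# so there is at most one match per element and sub is sufficient
with_sub  <- sub(pattern, "\\1", x)
with_gsub <- gsub(pattern, "\\1", x)
print(identical(with_sub, with_gsub))
#[1] TRUE

# A string without any digit does not match and is returned unchanged
no_digit <- sub(pattern, "\\1", "No number here")
print(no_digit)
#[1] "No number here"
```

If strings without a number should be handled differently (for example, returned as NA), that needs an extra step; the pattern alone leaves them untouched.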