a <- c("this is a number 9999333333 and i got 12344")
How can I replace the digits beyond the fifth with "X" in any number that has more than 5 digits?
Expected Output:
"this is a number 99993XXXXX and i got 12344"
Code I tried:
gsub("(.{5}).*", "X", a)
You can use gsub with a PCRE regex:
(?:\G(?!^)|(?<!\d)\d{5})\K\d
See the regex demo. Details:
(?:\G(?!^)|(?<!\d)\d{5}) - the end of the previous successful match (\G(?!^)) or (|) a location not preceded by a digit ((?<!\d)) followed by any five digits
\K - match reset operator, discarding all text matched so far
\d - a digit.
See the R demo:
a <- c("this is a number 9999333333 and i got 12344")
gsub("(?:\\G(?!^)|(?<!\\d)\\d{5})\\K\\d", "X", a, perl=TRUE)
## => [1] "this is a number 99993XXXXX and i got 12344"
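If the number of digits to keep needs to vary, the same pattern can be built with sprintf. This is only a sketch: the function name mask_extra_digits and its keep argument are made up for illustration and are not part of the answer above.

```r
# Hypothetical helper: keep the first `keep` digits of every digit run
# and replace each further digit with "X", using the PCRE pattern above.
mask_extra_digits <- function(x, keep = 5) {
  # %d is filled with the number of leading digits to leave untouched
  pattern <- sprintf("(?:\\G(?!^)|(?<!\\d)\\d{%d})\\K\\d", keep)
  gsub(pattern, "X", x, perl = TRUE)
}

a <- c("this is a number 9999333333 and i got 12344")
mask_extra_digits(a)
## [1] "this is a number 99993XXXXX and i got 12344"
```

Note that with a smaller keep value, shorter numbers are masked as well: mask_extra_digits(a, keep = 2) would also turn 12344 into 12XXX.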
gsubfn in the gsubfn package is like gsub except the replacement string can be a function which inputs the capture groups and outputs a replacement to the match. The function can optionally be expressed in a formula notation as we do here.
The regular expression (\d{5}) matches and captures 5 digits and (\d+) matches and captures the remaining digits. The two capture groups are fed into the function and are pasted back together except each character in the second is replaced with X. r"{...}" is the notation for string literals introduced in R 4.0 which eliminates having to use double backslashes to denote a backslash within a string literal.
library(gsubfn)
gsubfn(r"{(\d{5})(\d+)}", ~ paste0(x, gsub(".", "X", y)), a)
## [1] "this is a number 99993XXXXX and i got 12344"
If we replace the first argument with the regular expression r"{(\d{2})(\d{4,})}" then it will replace all but the first two digits provided there are at least 6 digits.
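For readers who do not have gsubfn installed, the same "keep two digits, but only when there are at least six" rule can probably be written in base R as well. The sketch below swaps in a different technique: it reuses the \G and \K idea from the PCRE answer and adds a lookahead (?=\d{4}) so that a number only qualifies when four more digits follow the first two. The pattern is my own adaptation, not something taken from the gsubfn answer.

```r
a <- c("this is a number 9999333333 and i got 12344")

# (?<!\d)\d{2}(?=\d{4}) : the first two digits of a run of 6 or more digits
# \G(?!^)               : or continue right after the previous match
# \K\d                  : forget what was matched so far, then match one digit
gsub("(?:\\G(?!^)|(?<!\\d)\\d{2}(?=\\d{4}))\\K\\d", "X", a, perl = TRUE)
## [1] "this is a number 99XXXXXXXX and i got 12344"
```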
An alternative that does not use gsub to replace numbers longer than 5 digits is to split the string on spaces with strsplit, test whether each piece contains only digits, and combine substr and strrep:
paste(lapply(strsplit(a, " ")[[1]], function(x) {
if(!grepl("\\D", x)) {
paste0(substr(x, 1, 5), strrep("X", pmax(0, nchar(x)-5)))
} else {x}}), collapse = " ")
#[1] "this is a number 99993XXXXX and i got 12344"
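The split-on-spaces approach only sees a number when the whole space-separated piece is digits, so something like "9999333333," would be left alone. A possible variation, sketched here with base R's gregexpr and regmatches, finds the digit runs directly instead. The function name mask_digit_runs and the keep argument are hypothetical names used only for this example.

```r
# Hypothetical helper: mask every digit run in place, wherever it occurs.
mask_digit_runs <- function(x, keep = 5) {
  # Positions of all runs of digits in each element of x
  m <- gregexpr("[0-9]+", x)
  # Rewrite each run: first `keep` digits, then one "X" per extra digit
  regmatches(x, m) <- lapply(regmatches(x, m), function(runs) {
    extra <- pmax(0, nchar(runs) - keep)
    paste0(substr(runs, 1, keep), strrep("X", extra))
  })
  x
}

a <- c("this is a number 9999333333 and i got 12344")
mask_digit_runs(a)
## [1] "this is a number 99993XXXXX and i got 12344"
```

Because the digit runs are located by a regular expression rather than by splitting, the original spacing and any punctuation around the numbers are preserved.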
To replace everything after the first 2 digits with X, but only for numbers longer than 5 digits:
paste(lapply(strsplit(a, " ")[[1]], function(x) {
if(!grepl("\\D", x) && nchar(x) > 5) {
paste0(substr(x, 1, 2), strrep("X", pmax(0, nchar(x)-2)))
} else {x}}), collapse = " ")
#[1] "this is a number 99XXXXXXXX and i got 12344"