
Java ReplaceAll vs For loop with Replace in it

I have a string that holds some data. I need to remove some special characters from it and then tokenize it.

Which of the following two approaches performs better?

String data = "Random data (For performance) Waiting for reply?";
// replaceAll takes a regex, so the metacharacters ? . ( ) must be escaped
data = data.replaceAll("\\?", "");
data = data.replaceAll("\\.", "");
data = data.replaceAll(",", "");
data = data.replaceAll("\\(", "");
data = data.replaceAll("\\)", "");

String[] tokens = data.split("\\s+");  
for(int j = 0; j < tokens.length; j++){
  //Logic on tokens
}  

OR

String data = "Random data (For performance) Waiting for reply?";

String[] tokens = data.split("\\s+");  
for(int j = 0; j < tokens.length; j++){
    tokens[j]=tokens[j].replace("?", "");
    tokens[j]=tokens[j].replace(".", "");
    tokens[j]=tokens[j].replace(",", "");
    tokens[j]=tokens[j].replace("(", "");
    tokens[j]=tokens[j].replace(")", "");      

  //Logic on each token
}  

Or is there another approach that would improve performance? (Some statistics on this would be greatly appreciated.)

The for loop shown above is needed in either case, because it performs other logic on each token.
So the question is: is it faster to run the replacements once on the whole string, or to run replace on each token inside the loop that executes anyway?

In other words: replace once and then process the tokens, or replace token by token and process each one as it is cleaned?

Thanks in Advance

Abhishek asked Dec 31 '25 04:12


2 Answers

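Plain replace is enough, and the replacement does not need a loop.

replaceAll uses the regex engine under the hood, which carries more overhead than a literal replacement: it interprets its first argument as a pattern, whereas replace treats it as literal text.

There seems to be a common misunderstanding of the "All" suffix: replace already replaces every occurrence, and the suffix only signals that the first argument is a regular expression.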

See Difference between String replace() and replaceAll().
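To make the difference concrete, here is a minimal sketch; the class and method names (ReplaceVsReplaceAll, literal, regexDot, compiles) are made up for illustration and are not from the linked question. It only relies on String.replace, String.replaceAll and PatternSyntaxException from the standard library.

```java
import java.util.regex.PatternSyntaxException;

public class ReplaceVsReplaceAll {

    // replace(CharSequence, CharSequence) treats "." as a literal dot
    // and removes every occurrence of it.
    public static String literal(String s) {
        return s.replace(".", "");
    }

    // replaceAll(String, String) treats "." as a regex that matches
    // any character, so (on a single-line input) everything is removed.
    public static String regexDot(String s) {
        return s.replaceAll(".", "");
    }

    // Returns false when the argument is not a valid regular expression,
    // which is what happens with an unescaped "?" or "(".
    public static boolean compiles(String regex) {
        try {
            "".replaceAll(regex, "");
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("[" + literal("a.b.c") + "]");
        System.out.println("[" + regexDot("a.b.c") + "]");
        System.out.println(compiles("?"));
        System.out.println(compiles("\\?"));
    }
}
```

So for removing fixed characters such as ?, ., (, ) and the comma, replace needs no escaping, while replaceAll needs each metacharacter escaped.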

Update

I found a very similar question:

Removing certain characters from a string

Vadzim answered Jan 01 '26 16:01


I am not aware of any statistics for this kind of problem, but if you are concerned about performance, I would first replace the several replaceAll() calls with a single one, like this:

data = data.replaceAll("\\?|\\.|\\)|\\(|,", "");

It might be faster, since the string is scanned once instead of five times.
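A possible variation on the same idea, offered as a sketch rather than a measured recommendation: compile the pattern once and reuse it, and write the alternation as a character class, where ?, ., ( and ) need no escaping. The class and method names (PunctuationStripper, strip, tokenize) are assumptions made for this example.

```java
import java.util.regex.Pattern;

public class PunctuationStripper {

    // Compiled once and reused; inside [...] the characters ? . , ( )
    // are literals, so no backslashes are needed.
    private static final Pattern PUNCTUATION = Pattern.compile("[?.,()]");
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    // Removes all listed punctuation characters in a single pass.
    public static String strip(String data) {
        return PUNCTUATION.matcher(data).replaceAll("");
    }

    // Strips punctuation from the whole string, then splits on whitespace.
    public static String[] tokenize(String data) {
        return WHITESPACE.split(strip(data).trim());
    }

    public static void main(String[] args) {
        String data = "Random data (For performance) Waiting for reply?";
        for (String token : tokenize(data)) {
            // Logic on each token would go here.
            System.out.println(token);
        }
    }
}
```

Whether this actually beats chained replace calls depends on the JDK version and the input, so it is worth measuring with a proper benchmark before relying on it.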

rlinden answered Jan 01 '26 16:01


