
The significance of space in this JS regexp?

I've been learning some Javascript regular expressions today and I'm failing to understand how the following code works.

var toswop = 'last first\nlast first\nlast first';
var swapped = toswop.replace(/([\w]+)\b([\w ]+)/g,'$2 $1');
alert(swapped);

It correctly alerts the words swapped round into the correct sequence. However, the following code (note the missing space after the second \w) doesn't work. It just prints them in the original order.

var toswop = 'last first\nlast first\nlast first';
var swapped = toswop.replace(/([\w]+)\b([\w]+)/g,'$2 $1');
alert(swapped);
Samir asked Dec 30 '25 17:12


2 Answers

From the MDN:

\w

Matches any alphanumeric character including the underscore. Equivalent to [A-Za-z0-9_].

For example, /\w/ matches 'a' in "apple," '5' in "$5.28," and '3' in "3D."

When you add a space, you change the character set from alphanumerics and an underscore to alphanumerics and an underscore and a space.
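As a minimal sketch of that difference (the variable names here are only illustrative and not from the question), matching each character class on its own against one line of the input shows what the extra space changes:

```javascript
// One line of the question's input.
const sample = 'last first';

// Word characters only: the space ends each run, so there are two matches.
const withoutSpace = sample.match(/[\w]+/g);

// Word characters or a space: the whole line is a single run.
const withSpace = sample.match(/[\w ]+/g);

console.log(withoutSpace); // [ 'last', 'first' ]
console.log(withSpace);    // [ 'last first' ]
```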

Blender answered Jan 01 '26 06:01


I think you are mistakenly using '\b' to match the space, but in JavaScript regular expressions '\b' is a zero-width word boundary: it matches the position at the beginning or end of a word and consumes no characters.
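One way to see where '\b' matches is to replace every match with a visible marker. This is only an illustrative sketch; the '|' marker and the variable name are arbitrary choices:

```javascript
// Insert a '|' at every word boundary. Nothing is removed from the string,
// because \b matches a position, not a character.
const marked = 'last first'.replace(/\b/g, '|');

console.log(marked); // |last| |first|
```

The space is still there between the two inner markers, which shows that '\b' did not consume it.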

Therefore the /([\w]+)\b/ part of the regular expression matches only up to the end of the word 'last'. The remaining string is ' first' (note the space at the beginning).

To match the remainder you need ([\w ]+), which translates to 'one or more occurrences of any word character or space character'. That is exactly what is needed to match the remaining string ' first'.

Note that even after the words are swapped, there is a space before the word 'first', because that space was captured as part of the second group.
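To inspect the two captured groups directly, a small sketch using exec on a single line of the input might look like this (the variable names are illustrative):

```javascript
// Non-global copy of the pattern from the question, run against one line.
const groups = /([\w]+)\b([\w ]+)/.exec('last first');

// JSON.stringify makes the leading space in the second group visible.
console.log(JSON.stringify(groups[1])); // "last"
console.log(JSON.stringify(groups[2])); // " first"

// '$2 $1' therefore produces the captured space, 'first', a space, then 'last'.
const swappedLine = 'last first'.replace(/([\w]+)\b([\w ]+)/g, '$2 $1');
console.log(JSON.stringify(swappedLine)); // " first last"
```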

To prove this further, imagine you changed your input to:

var toswop = 'last first another\nlast first another\nlast first another';

You can see your swapped text becomes:

 first another last
 first another last
 first another last 

That is because the last segment of the regular expression, ([\w ]+), kept matching both spaces and word characters, so it included the word 'another' in the match.

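But if you remove the space from the square brackets, the second group can no longer match the remainder ' first', because it is not a string of word characters but a space followed by word characters. In fact the pattern can then never match anything: '\b' needs a word character on one side and a non-word character (or the edge of the string) on the other, while ([\w]+)\b([\w]+) demands word characters on both sides. With no match, replace returns the input unchanged, which is why the words appear in their original order.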

That is why the space is significant here.
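A quick check of this, as a sketch with illustrative variable names, is to test the pattern without the space and compare the replace result with the input:

```javascript
// Pattern from the question with the space removed from the second class.
const noSpacePattern = /([\w]+)\b([\w]+)/;

// No match is found anywhere in the line.
const matches = noSpacePattern.test('last first');
console.log(matches); // false

// With nothing to replace, the string comes back unchanged.
const input = 'last first\nlast first\nlast first';
const output = input.replace(/([\w]+)\b([\w]+)/g, '$2 $1');
console.log(output === input); // true
```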

But if you change your regex like this:

swapped = toswop.replace(/([\w]+)\s([\w]+)/g,'$2 $1');

Then it works without the space in the character class, because the \s in the middle will match the space between the two words.
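Run against the two-word input from the question, the \s version can be sketched as follows (the variable names are illustrative):

```javascript
const original = 'last first\nlast first\nlast first';

// \s consumes the single space between the words, so neither group captures it.
const result = original.replace(/([\w]+)\s([\w]+)/g, '$2 $1');

console.log(result);
// first last
// first last
// first last
```

One caveat to keep in mind: \s also matches a newline, so on lines with more than two words a pair could be matched across a line break. Using a literal space in place of \s would keep each match on one line.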

Hope this clarifies your question.

See here for JavaScript RegEx syntax: http://www.w3schools.com/jsref/jsref_regexp_begin.asp

See here for my fiddle if you want to experiment more: http://jsfiddle.net/BuddhiP/P5Jqm/

BuddhiP answered Jan 01 '26 06:01


