In PHP I need to highlight multiple given words in a string, for example by wrapping each match in an <em> tag.
But if a word ends in +, I cannot get it to match.
I understand the problem below is that + is not a word character, so the \b word boundary after it fails to match. But how can I write this so that it matches and wraps all given words, even if a given word ends in +?
$my_text = 'test c+ and javascript etc but NOT javascripter';
$words_to_highlight = array('javascript', 'c+');
foreach($words_to_highlight as $word){
    
    $search_pattern = str_replace('+', '\\+', $word);
    
    // this doesn't match replacement
    echo "\n".preg_replace("/\b(".$search_pattern.")\b/i", '<em>$1</em>', $my_text);
    
    // works if I remove the \b flag, but I don't want to match "javascript" inside "javascripter"
    echo "\n".preg_replace("/(".$search_pattern.")/i", '<em>$1</em>', $my_text);
    
}
Output is:
test c+ and <em>javascript</em> etc but NOT javascripter
test c+ and <em>javascript</em> etc but NOT <em>javascript</em>er
test c+ and javascript etc but NOT javascripter
test <em>c+</em> and javascript etc but NOT javascripter
What I want as a result is:
test <em>c+</em> and <em>javascript</em> etc but NOT javascripter
Instead of using word boundaries, you can use whitespace boundaries in the form of lookarounds: (?<!\S) asserts that there is no non-whitespace character directly to the left, and (?!\S) asserts the same to the right.
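To make the difference concrete, here is a small sketch that runs both kinds of boundary against the sample text from the question. The variable name $text is only illustrative.

```php
<?php
$text = 'test c+ and javascript etc but NOT javascripter';

// \b needs a word character on exactly one side. After "c+" both the "+"
// and the following space are non-word characters, so \b fails there.
var_dump(preg_match('/\bc\+\b/i', $text));          // int(0)

// The whitespace boundaries only require that no non-whitespace
// character touches the match on either side.
var_dump(preg_match('/(?<!\S)c\+(?!\S)/i', $text)); // int(1)

// "javascript" inside "javascripter" is still rejected, because the "e"
// that follows is a non-whitespace character.
var_dump(preg_match_all('/(?<!\S)javascript(?!\S)/i', $text)); // int(1)
```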
For escaping characters which are part of the regex syntax, you can use preg_quote.
To do the replacement with a single pattern that matches all the words, you can dynamically create the regex with a non-capturing group listing all the alternatives, separated by a pipe character |
The final pattern would look like this:
(?<!\S)(?:javascript|c\+)(?!\S)
See the matches in a regex demo and a PHP demo.
As you are not matching other text, you don't need a capture group, and you can use the full match in the replacement, denoted by $0.
For example:
$my_text = 'test c+ and / javascript etc but NOT javascripter';
$words_to_highlight = array('javascript', 'c+');
$pattern = sprintf(
    "/(?<!\S)(?:%s)(?!\S)/i",
    implode('|',
        array_map(function ($s) {
            return preg_quote($s, '/');
        }, $words_to_highlight)));
echo preg_replace($pattern, '<em>$0</em>', $my_text);
Output
test <em>c+</em> and / <em>javascript</em> etc but NOT javascripter
If you are using PHP 7.4 or higher, you can use an arrow function for array_map:
 array_map(fn($s) => preg_quote($s, '/'), $words_to_highlight)
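If you need this in more than one place, the pieces above could be wrapped in a small function. This is only a sketch: the name highlight_words and the guard for an empty word list are my own additions, not something PHP provides, and it assumes PHP 7.4+ because of the arrow function.

```php
<?php
// Hypothetical helper (not a built-in): wraps every whitespace-delimited
// occurrence of the given words in <em> tags, case-insensitively.
function highlight_words(string $text, array $words): string
{
    if ($words === []) {
        // With no alternatives the group would be empty and could match
        // empty strings, so return the text unchanged.
        return $text;
    }

    // Escape regex metacharacters (and the "/" delimiter) in every word.
    $alternatives = implode('|', array_map(fn($w) => preg_quote($w, '/'), $words));

    return preg_replace(
        '/(?<!\S)(?:' . $alternatives . ')(?!\S)/i',
        '<em>$0</em>',
        $text
    );
}

echo highlight_words(
    'test c+ and javascript etc but NOT javascripter',
    ['javascript', 'c+']
);
// test <em>c+</em> and <em>javascript</em> etc but NOT javascripter
```

One thing to be aware of with whitespace boundaries: a word directly followed by punctuation, such as "javascript," with a trailing comma, will not match, because the comma is a non-whitespace character. If that matters for your text, you would have to loosen the right-hand lookahead.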