
How do I use PHP preg_replace with the same pattern, when the word happens multiple times?

Sorry that my question is so horribly worded, but I have no idea how to state it as a question. It is easier for me to just show code and explain.

I am trying to write a function that allows tagging of words. We have a database of words that we call the glossary. I want to take a large amount of text and find every instance of [G]some word/words here[/G], then replace each one with <a href="viewglossary.php?word={WORD/WORDS BETWEEN [G][/G]}">{WORD/WORDS BETWEEN [G][/G]}</a>

Here is my current function:

function getGlossary($str)
{
    $patterns = array();
    $patterns[]='/\[G\](.*)\[\/G\]/';
    $replacements = array();
    $replacements[]='<a href="viewglossary.php?word=$1">$1</a>';
    return preg_replace($patterns, $replacements, $str);
}
echo getGlossary($txt);

If the text contains only a single [G][/G] tag, it works.

$txt='What you need to know about [G]beans[/G]';

This will output

What you need to know about <a href="viewglossary.php?word=beans">beans</a>

However this

$txt='What you need to know about [G]beans[/G] and [G]corn[/G]';

will output

What you need to know about <a href="viewglossary.php?word=beans[/G] and [G]corn">beans[/G] and [G]corn</a>

I am sure I have something wrong in my pattern. Any help would be appreciated.

tvirelli asked Dec 05 '25 15:12



1 Answer

You need to make your dot-star lazy: .*?

  • Without the ? to keep the .* in check, the .* consumes every character up to the final [/G].
  • The * quantifier is greedy, so the .* starts off by matching all the characters up to the very end of the string (or of the line, since . does not match newlines by default). Then it backtracks only as far as needed to allow the [/G] to match, which means it only backtracks to the last [/G].
  • The ? makes a quantifier "lazy", so that it matches only as far as needed for the rest of the regex to match. Therefore it will only match up to the first [/G].
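
To see the difference described above, here is a minimal sketch that runs both the greedy and the lazy pattern against the two-tag string from the question. The variable names $greedy and $lazy are only illustrative.

```php
<?php
$txt = 'What you need to know about [G]beans[/G] and [G]corn[/G]';

// Greedy: .* runs to the end of the string, then backtracks to the LAST [/G].
preg_match('~\[G\](.*)\[/G\]~', $txt, $greedy);

// Lazy: .*? grows one character at a time and stops at the FIRST [/G].
preg_match('~\[G\](.*?)\[/G\]~', $txt, $lazy);

echo $greedy[1], "\n"; // beans[/G] and [G]corn
echo $lazy[1], "\n";   // beans
```

The greedy capture is exactly the text that ended up inside the broken link in the question.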

Modify your regex like so:

$pattern = "~\[G\](.*?)\[/G\]~";

Note that to make the regex easier to read, I have changed the delimiter and unescaped the forward slash, as there is no need to escape slashes unless the delimiter is a slash. Common delimiters include ~, %, @, #... But really tildes are the most beautiful. :)
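
Put together, the function from the question might look like the sketch below once the lazy pattern is dropped in. getGlossary is the asker's own function; getGlossaryEncoded is a hypothetical variant, not part of the original question or answer, that assumes glossary terms may contain spaces or HTML-special characters and therefore encodes them with rawurlencode and htmlspecialchars.

```php
<?php
function getGlossary($str)
{
    // Lazy (.*?) so each [G]...[/G] pair is matched on its own.
    $pattern = '~\[G\](.*?)\[/G\]~';
    $replacement = '<a href="viewglossary.php?word=$1">$1</a>';
    return preg_replace($pattern, $replacement, $str);
}

// Hypothetical variant (an assumption, not from the original post):
// encodes the captured term for use in the URL and in the link text.
function getGlossaryEncoded($str)
{
    return preg_replace_callback('~\[G\](.*?)\[/G\]~', function ($m) {
        $href = 'viewglossary.php?word=' . rawurlencode($m[1]);
        return '<a href="' . htmlspecialchars($href) . '">'
            . htmlspecialchars($m[1]) . '</a>';
    }, $str);
}

$txt = 'What you need to know about [G]beans[/G] and [G]corn[/G]';
echo getGlossary($txt), "\n";
echo getGlossaryEncoded('See [G]green beans[/G]'), "\n";
```

With the sample text, the first call should now produce one link for beans and a separate link for corn. The second call is only worth considering if multi-word terms are expected, since a raw space in the href is not a valid URL character.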

Reference

  • The Many Degrees of Regex Greed
  • Repetition with Star and Plus
zx81 answered Dec 07 '25 05:12



