
Can I use variables in a Regex pattern (C#)?

Tags: c#, regex

I have some HTML text in which I need to replace certain words with links to their glossary entries. For example, if the text contains the word "PHP", I want to replace it with <a href="glossary.html#php">PHP</a>. There are many such words to replace.

My code:

public struct GlossaryReplace
{
    public string word; // the word to replace, e.g. PHP
    public string link; // the link target for that word, e.g. glossary.html#php
}
public static GlossaryReplace[] Replaces = null;    

IHTMLDocument2 html_doc = webBrowser1.Document.DomDocument as IHTMLDocument2;
string html_content = html_doc.body.outerHTML;

for (int i = 0; i < Replaces.Length; i++)
{
    String substitution = "<a class=\"glossary\" href=\"" + Replaces[i].link + "\">" + Replaces[i].word + "</a>";
    html_content = Regex.Replace(html_content, @"\b" + Replaces[i].word + "\b", substitution);
}
html_doc.body.innerHTML = html_content;

The problem is that this doesn't work. However,

html_content = Regex.Replace(html_content, @"\bPHP\b", "some replacement");

this code works fine. I can't figure out what my mistake is.

Vdm17 asked Oct 27 '25 21:10


1 Answer

The @ prefix (which makes a verbatim string literal) only applies to the string literal that immediately follows it, so when you concatenate several literals you have to put it on each one that needs it.

Change this:

html_content = Regex.Replace(html_content, @"\b" + Replaces[i].word + "\b", substitution);

to:

html_content = Regex.Replace(html_content, @"\b" + Replaces[i].word + @"\b", substitution);

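In a regular expression, \b means a word boundary, but in a regular C# string literal it is the escape sequence for a backspace character (ASCII 8). The compiler reports an error if you use an escape sequence that doesn't exist in string literals (e.g. \s), but not in this case, because \b is valid in both string literals and regular expressions. Your pattern therefore ended with a literal backspace character instead of a word boundary, so it never matched.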
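To make the difference concrete, here is a small sketch comparing the two ways of writing the trailing \b. The class and method names (BackspaceVsBoundary, BrokenPattern, FixedPattern) and the sample sentence are made up for illustration and are not part of the original code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BackspaceVsBoundary
{
    // Hypothetical helper: builds the pattern as in the question.
    // The trailing "\b" is a regular literal, so it is one backspace character.
    public static string BrokenPattern(string word) => @"\b" + word + "\b";

    // Hypothetical helper: both literals are verbatim, so both \b reach the regex engine.
    public static string FixedPattern(string word) => @"\b" + word + @"\b";

    public static void Main()
    {
        string text = "I like PHP a lot."; // sample input, assumed for this sketch

        Console.WriteLine("\b".Length);   // 1 (a single backspace character)
        Console.WriteLine((int)"\b"[0]);  // 8
        Console.WriteLine(@"\b".Length);  // 2 (a backslash followed by 'b')

        // The broken pattern requires a literal backspace after "PHP".
        Console.WriteLine(Regex.IsMatch(text, BrokenPattern("PHP"))); // False
        Console.WriteLine(Regex.IsMatch(text, FixedPattern("PHP")));  // True
    }
}
```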

As a side note, a method that is useful when building regular expression patterns dynamically is Regex.Escape. It escapes the characters in a string that have a special meaning in a pattern, so @"\b" + Regex.Escape(Replaces[i].word) + @"\b" makes the pattern work even if the word contains such characters.
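As a rough illustration of why the escaping matters, the sketch below uses a glossary word containing a dot. The word "ASP.NET", the sample sentences, and the helper name GlossaryPattern.For are assumptions chosen for the example; only Regex.Escape and Regex.IsMatch come from the framework.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GlossaryPattern
{
    // Hypothetical helper: wraps the escaped word in word boundaries.
    public static string For(string word) => @"\b" + Regex.Escape(word) + @"\b";

    public static void Main()
    {
        // Without escaping, the dot is a wildcard that matches any character.
        string unescaped = @"\b" + "ASP.NET" + @"\b";
        string escaped = For("ASP.NET");

        Console.WriteLine(escaped); // \bASP\.NET\b

        Console.WriteLine(Regex.IsMatch("uses ASPxNET here", unescaped)); // True (unwanted match)
        Console.WriteLine(Regex.IsMatch("uses ASPxNET here", escaped));   // False
        Console.WriteLine(Regex.IsMatch("uses ASP.NET here", escaped));   // True
    }
}
```

In the loop from the question, the same idea would mean calling Regex.Escape on Replaces[i].word before concatenating it into the pattern.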

Guffa answered Oct 30 '25 13:10


