Clean font-size tag with Regex

Tags: c#, .net, regex

I am trying to use a regex to remove font-size:85%; from <font style="font-size:85%;font-family:arial,sans-serif">.

My regex is ^font-size:(*);

I mean that I need to delete the font-size declaration completely.

Can someone help me, please?

Thank you!

Terminador asked Jan 23 '26 20:01


2 Answers

Several things with your current regex will cause it to fail:

^font-size:(*);

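You are anchoring to the start of the line with ^, but the attribute is not at the start of the line.

* is a quantifier and needs something before it to repeat. Inside (*) it has nothing to apply to, so the pattern is invalid.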

Change it to:

font-size: ?\d{1,2}%;
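
As a minimal sketch of how this pattern could be applied in C#, assuming the markup from the question and written as top-level statements (C# 9 or later; in older versions the body would go inside a Main method). The variable names are illustrative only. Because the pattern expects one or two digits followed by %, values such as 100% or 12pt are left untouched, as the second input (a hypothetical one) shows.

```csharp
using System;
using System.Text.RegularExpressions;

// Pattern from the answer: "font-size:", an optional space,
// one or two digits, then "%;".
string pattern = @"font-size: ?\d{1,2}%;";

// Markup from the question.
string html = @"<font style=""font-size:85%;font-family:arial,sans-serif"">";
string cleaned = Regex.Replace(html, pattern, string.Empty);
Console.WriteLine(cleaned);
// prints <font style="font-family:arial,sans-serif">

// Hypothetical input with a non-percentage value: no match, so nothing changes.
string other = @"<font style=""font-size:12pt;color:red"">";
string unchanged = Regex.Replace(other, pattern, string.Empty);
Console.WriteLine(unchanged == other); // prints True
```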
Oded answered Jan 26 '26 11:01


This is the regex you will need:

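using System.Text.RegularExpressions;

string html = @"<font style=""font-size:85%;font-family:arial,sans-serif"">";
// Remove the font-size declaration through its ';', or up to the closing quote.
string pattern = @"font-size\s*?:.*?(;|(?=""|'|;))";
string cleanedHtml = Regex.Replace(html, pattern, string.Empty);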

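This regex also works when the font-size is given in pt or em, and when the style attribute contains a different set of CSS declarations (e.g. no font-family). It removes the declaration together with its trailing semicolon or, when font-size is the last declaration, everything up to the closing quote.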
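
To illustrate that claim, here is a small sketch that runs the same pattern over a few inputs. The sample strings are hypothetical and not taken from the question, and the code is written as top-level statements (C# 9 or later).

```csharp
using System;
using System.Text.RegularExpressions;

// Same pattern as in the answer.
string pattern = @"font-size\s*?:.*?(;|(?=""|'|;))";

// Hypothetical sample inputs: pt value, em value in single quotes,
// and font-size as the last declaration without a trailing semicolon.
string[] samples =
{
    @"<font style=""font-size:12pt;font-family:arial"">",
    @"<font style='font-size: 1.2em'>",
    @"<font style=""color:red;font-size:85%"">",
};

foreach (string sample in samples)
{
    Console.WriteLine(Regex.Replace(sample, pattern, string.Empty));
}
// prints:
// <font style="font-family:arial">
// <font style=''>
// <font style="color:red;">
```

Note that when font-size is the only declaration, an empty style attribute is left behind. Removing that as well would need a separate step.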

The explanation of the regex follows:

// font-size\s*?:.*?(;|(?="|'|;))
// 
// Match the characters “font-size” literally «font-size»
// Match a single character that is a “whitespace character” (spaces, tabs, and line breaks) «\s*?»
//    Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
// Match the character “:” literally «:»
// Match any single character that is not a line break character «.*?»
//    Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
// Match the regular expression below and capture its match into backreference number 1 «(;|(?="|'|;))»
//    Match either the regular expression below (attempting the next alternative only if this one fails) «;»
//       Match the character “;” literally «;»
//    Or match regular expression number 2 below (the entire group fails if this one fails to match) «(?="|'|;)»
//       Assert that the regex below can be matched, starting at this position (positive lookahead) «(?="|'|;)»
//          Match either the regular expression below (attempting the next alternative only if this one fails) «"»
//             Match the character “"” literally «"»
//          Or match regular expression number 2 below (attempting the next alternative only if this one fails) «'»
//             Match the character “'” literally «'»
//          Or match regular expression number 3 below (the entire group fails if this one fails to match) «;»
//             Match the character “;” literally «;»
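
As an alternative way to keep this explanation next to the pattern, the same regex can be written with inline comments using RegexOptions.IgnorePatternWhitespace. This is only a sketch, assuming top-level statements (C# 9 or later); the layout and comments are mine, and the pattern itself is unchanged.

```csharp
using System;
using System.Text.RegularExpressions;

// With IgnorePatternWhitespace, unescaped whitespace in the pattern is
// ignored and '#' starts a comment that runs to the end of the line.
string pattern = @"
    font-size     # the literal text font-size
    \s*?          # optional whitespace, as little as possible
    :             # a literal colon
    .*?           # the value, as little as possible
    (             # group 1:
      ;           #   either consume the terminating semicolon
      |           #   or
      (?=""|'|;)  #   stop just before a quote or semicolon
    )";

string html = @"<font style=""font-size:85%;font-family:arial,sans-serif"">";
string cleaned = Regex.Replace(html, pattern, string.Empty,
                               RegexOptions.IgnorePatternWhitespace);
Console.WriteLine(cleaned);
// prints <font style="font-family:arial,sans-serif">
```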
Nikola Malešević answered Jan 26 '26 12:01