I'm looking for the best way to do some sort of "smart" HTML encoding. For instance:
From: <a>Next >></a> to: <a>Next &gt;&gt;</a>
From: <p><a><b><< Prev</b></a><br/><a>Next >></a></p> to: <p><a><b>&lt;&lt; Prev</b></a><br/><a>Next &gt;&gt;</a></p>
So only the parts of the text that are not XML/HTML markup would be encoded, as if HtmlEncode had been called on them.
Any suggestions?
EDIT: This should be as lightweight as possible. The incoming text will come from users who have no knowledge of HTML encoding.
Yes: don’t ever write HTML into your source code. Instead work with an API like DOM that takes care of all encoding issues for you.
If you want a solid and totally reliable (but heavyweight) C# solution, then I'd use the HTML Agility Pack library. You could then iterate through the text nodes and HTML-encode their contents. It's a bit more bullet-proof than regular expressions, but obviously more resource-intensive.
If you want to do it client-side, then use jQuery. See "Encode HTML entities with jQuery".
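If even a parser or jQuery feels too heavy, here is a minimal sketch of the regular-expression route in plain JavaScript, with no jQuery or DOM required. It is only an illustration, not a vetted sanitizer: the names `TAG_PATTERN`, `encodeText` and `smartEncode` are made up for this example, and it assumes that anything shaped like `<name ...>` or `</name>` is a real tag and that attribute values never contain `<` or `>`. Anything that does not match that shape is treated as text and encoded.

```javascript
// Anything that looks like a well-formed tag: <a>, </a>, <br/>, <a href="x">.
// Assumption: attribute values never contain '<' or '>'.
const TAG_PATTERN = /<\/?[A-Za-z][A-Za-z0-9]*(?:\s[^<>]*)?\/?>/g;

// Encode the characters that are special in HTML text and attribute values.
function encodeText(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Copy tag-like tokens through unchanged and encode everything between them.
function smartEncode(input) {
  let result = "";
  let lastIndex = 0;
  for (const match of input.matchAll(TAG_PATTERN)) {
    result += encodeText(input.slice(lastIndex, match.index));
    result += match[0];
    lastIndex = match.index + match[0].length;
  }
  return result + encodeText(input.slice(lastIndex));
}

console.log(smartEncode("<a>Next >></a>"));
// prints: <a>Next &gt;&gt;</a>
```

Two caveats with this sketch: text that is already encoded gets encoded again (`&gt;` becomes `&amp;gt;`), and user text that happens to look like a tag, such as `a <b and c> d`, is passed through as markup. If either matters, the parser-based approach above is the safer choice.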