
What is the appropriate way to embed potentially non-well-formed html in an xml document?

The marketing people want to be able to write inline HTML directly in the (XML-based) CMS. XHTML compliance and the like potentially go down the drain, but they're the boss(es). The CMS uses a regular XML/XSLT transformation pipeline. Currently we just use a single element with a CDATA section containing all the nastiness, built with some ugly string concatenation.

Are there any other ways to do this?

Edit: I may be able to convince them that the HTML should be a well-formed fragment of some sort, but there is no way in the known universe I can get them to agree to XHTML Strict compliance like the rest of the content. But from what I understand, being well-formed alone doesn't really help me, does it?

krosenvold asked Nov 25 '25 10:11



1 Answer

CDATA is the only way to do this as-is. There is simply no way that markup which isn't well-formed will go into an XML document as parsed structure.
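If you stay with CDATA, building the node through a DOM API rather than string concatenation avoids one classic trap: the content itself may contain `]]>`, which would end the section early. Here is a minimal sketch in Python using the standard library's `xml.dom.minidom`. The function name `wrap_html_in_cdata` and the element name `rawHtml` are made up for illustration and are not part of your CMS.

```python
from xml.dom import minidom


def wrap_html_in_cdata(raw_html, element_name="rawHtml"):
    """Return an XML document string holding raw_html inside CDATA.

    element_name is a hypothetical name; use whatever your schema expects.
    """
    doc = minidom.Document()
    node = doc.createElement(element_name)
    doc.appendChild(node)

    # A CDATA section cannot contain "]]>", so split the text at each
    # occurrence and spread the terminator over two adjacent sections:
    # "...]]" ends one section and ">..." starts the next.
    parts = raw_html.split("]]>")
    for i, part in enumerate(parts):
        if i > 0:
            part = ">" + part
        if i < len(parts) - 1:
            part = part + "]]"
        node.appendChild(doc.createCDATASection(part))

    # Note: characters that are illegal in XML 1.0 (most control
    # characters) are not handled here and would still need filtering.
    return doc.toxml()


if __name__ == "__main__":
    print(wrap_html_in_cdata("<p>Unclosed <b>bold<br>and ]]> inside"))
```

A consumer reads the text back by concatenating the data of all child nodes of the element, so the split is invisible downstream.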

May I suggest an alternative solution though? Fix the problem markup as it's inserted into the XML - definitely not trivial, but frankly the task they're giving you is absurd.

Check out HTML Tidy or Beautiful Soup, which can take tag soup and turn it into valid, well-formed XHTML.
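To give a rough idea of what "fixing on insert" means, here is a small sketch that uses only Python's standard library. It is a stand-in for illustration, not how HTML Tidy or Beautiful Soup work internally, and it is far less thorough than either. The names `SoupFixer`, `repair_fragment`, `to_element` and the wrapper element `fragment` are all hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape
from html.parser import HTMLParser

# HTML elements that never have a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class SoupFixer(HTMLParser):
    """Re-emit HTML as balanced, escaped markup (comments are dropped)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []     # pieces of repaired markup
        self.stack = []   # currently open non-void tags

    def _emit_open(self, tag, attrs, self_close):
        unique = {}
        for name, value in attrs:
            # First occurrence wins; valueless attributes get their own name.
            unique.setdefault(name, name if value is None else value)
        rendered = "".join(' %s="%s"' % (k, escape(v, quote=True))
                           for k, v in unique.items())
        self.out.append("<%s%s%s>" % (tag, rendered, " /" if self_close else ""))

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            self._emit_open(tag, attrs, True)
        else:
            self._emit_open(tag, attrs, False)
            self.stack.append(tag)

    def handle_startendtag(self, tag, attrs):
        self._emit_open(tag, attrs, True)

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray closing tag: ignore it
        # Close everything opened after the matching tag, then the tag itself.
        while self.stack:
            open_tag = self.stack.pop()
            self.out.append("</%s>" % open_tag)
            if open_tag == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def repair_fragment(html_text):
    """Return a well-formed version of html_text (best effort only)."""
    fixer = SoupFixer()
    fixer.feed(html_text)
    fixer.close()
    while fixer.stack:  # close whatever is still open at the end
        fixer.out.append("</%s>" % fixer.stack.pop())
    return "".join(fixer.out)


def to_element(html_text, wrapper="fragment"):
    """Parse the repaired fragment into a real XML element."""
    return ET.fromstring("<%s>%s</%s>" % (wrapper, repair_fragment(html_text), wrapper))
```

Once the fragment parses, it can live in the document as ordinary elements that XSLT can match on, rather than as an opaque CDATA blob. Odd tag or attribute names and characters that are illegal in XML would still break this sketch, which is a good reason to use one of the real tools.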

annakata answered Nov 28 '25 03:11



