
HTML inside XML CDATA being converted with &lt; and &gt; brackets

I have some sample XML:

<sample><![CDATA[Line 1<br />Line 2<br />Line 3<br />]]></sample>

I'm using ASP to output this XML using a stylesheet like so:

' Fetch the XML source and the stylesheet over HTTP (synchronous requests)
Set xmlHttp = Server.CreateObject("Microsoft.XMLHTTP")
xmlHttp.open "GET", URLxml, false
xmlHttp.send()

Set xslHttp = Server.CreateObject("Microsoft.XMLHTTP")
xslHttp.open "GET", xXsl, false
xslHttp.send()

' Load both responses into DOM documents
Set xmlDoc = Server.CreateObject("MICROSOFT.XMLDOM")
Set xslDoc = Server.CreateObject("MICROSOFT.XMLDOM")
xmlDoc.async = false
xslDoc.async = false
xmlDoc.Load xmlHttp.responseXML
xslDoc.Load xslHttp.responseXML

' Apply the stylesheet and write the result to the response
Response.Write xmlDoc.transformNode(xslDoc)

However, once this is written out, the HTML output shows up as:

Line 1&lt;br /&gt;Line 2&lt;br /&gt;Line 3

I can see that ASP is converting the brackets in the code, but I'm not sure why. Any thoughts?

Steve K. asked Sep 16 '25 10:09


2 Answers

I have some sample XML:

<sample><![CDATA[Line 1<br />Line 2<br />Line 3<br />]]></sample>

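This is a sample element with a single text node child. The CDATA section is only another way of writing that text, so each <br /> inside it is a run of characters, not an element.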

Suppose you apply an identity transform. Then the result will be:

<sample>Line 1&lt;br /&gt;Line 2&lt;br /&gt;Line 3&lt;br /&gt;</sample>

Why? Because when text nodes and attribute values are serialized, the special characters &, < and > are escaped as entity references (&amp;, &lt; and &gt;).
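For reference, the identity transform mentioned above is conventionally written as the stylesheet below. This is the standard XSLT 1.0 idiom, not code taken from the question, and it is only a sketch for reproducing the escaped result.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <!-- Copy every attribute and node as-is, recursing into children -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
```

Because the content of sample is a text node, the copy is still text, and the serializer escapes it when writing the result.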

EDIT: Of course, you could use DOE (disable-output-escaping). But besides being an optional feature that a processor is not required to support, the result is still a text node in the result tree, only serialized without the entity escaping. It becomes markup only after another parsing phase (this can be useful when outputting an encoded HTML fragment into an (X)HTML document, as with feeds, at the risk of malformed output).

This stylesheet:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="sample">
        <p>
            <xsl:value-of select="." disable-output-escaping="yes"/>
        </p>
    </xsl:template>
</xsl:stylesheet>

Output:

<p>Line 1<br />Line 2<br />Line 3<br /></p>

Rendered as (actual markup):

Line 1
Line 2
Line 3

In addition to @Alejandro's explanation, here is the best possible solution:

Never put markup in a text (CDATA) node.

Instead of:

<sample><![CDATA[Line 1<br />Line 2<br />Line 3<br />]]></sample>

always create:

<sample>Line 1<br />Line 2<br />Line 3<br /></sample>

Remember: putting markup inside CDATA means losing it as markup.
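With real br elements as children of sample, a stylesheet can copy them to the output as nodes, and disable-output-escaping is no longer needed. A minimal sketch, assuming the source XML can be changed as shown above and reusing the p wrapper from the earlier stylesheet:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="sample">
        <p>
            <!-- Copy the text nodes and br elements of sample as nodes -->
            <xsl:copy-of select="node()"/>
        </p>
    </xsl:template>
</xsl:stylesheet>
```

Here the br elements reach the result tree as elements, so the serializer writes them as tags instead of escaped text.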

Dimitre Novatchev answered Sep 18 '25 22:09