CDATA Section Node deprecated in DOM4

I have been doing some reading on the DOM documents and it seems that in the new standard the node type for CDATA sections is now gone.

It seems that Mozilla got rid of CDATA_SECTION_NODE since it is now deprecated; the DOM document now says that it's historical. My question is: if the nodeType property no longer reports CDATA_SECTION_NODE, how does the DOM deal with those sections? That is, if I were to write

<script><![CDATA[ /*Some code with < & and what not */ ]]></script>

then how will browsers deal with this if there is no node to handle the CDATA sections? Do they simply read the contents and ignore the <![CDATA[ and ]]> strings?

Furthermore, is there anywhere that explains the decision to get rid of it?

jmlopez asked Mar 03 '26 05:03


1 Answer

There are a few distinct components to CDATA handling:

  1. Whether CDATA is a distinct node type in the DOM (a CDATASection interface with a Node.CDATA_SECTION_NODE nodeType) or whether it's just a Text node.
  2. How the (HTML or XML) parser handles markup containing <![CDATA[ ... ]]>: how it treats the special characters (<, >, &) inside the section, whether it emits a single node for the section, and if so whether that node is a Text node or a CDATASection node in the DOM.
  3. How CDATA sections are serialized (wrapped in <![CDATA[ ]]> or with special characters escaped).
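
These three components can be observed separately in any DOM implementation that still has a CDATASection type. The following is a minimal sketch using Python's standard-library xml.dom.minidom, which is an XML DOM and not a browser, so browser behaviour may well differ; the sample markup and variable names are made up for illustration.

```python
from xml.dom import minidom
from xml.dom.minidom import Node

# Hypothetical sample document: one element holding a CDATA section.
doc = minidom.parseString("<root><![CDATA[a < b & c]]></root>")
root = doc.documentElement
cdata = root.firstChild

# Components 1 and 2: this XML parser emits a node with its own nodeType,
# and the special characters inside it are kept literally.
is_cdata_node = cdata.nodeType == Node.CDATA_SECTION_NODE
cdata_data = cdata.data

# Component 3: a CDATASection is serialized wrapped in <![CDATA[ ]]>,
# while a Text node with the same data is serialized with escapes.
cdata_markup = cdata.toxml()
text_markup = doc.createTextNode(cdata.data).toxml()

print(is_cdata_node, repr(cdata_data))
print(cdata_markup)
print(text_markup)
```

Comparing the last two printed lines shows that the data is identical in both nodes and only the serialized markup differs.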

As far as I can tell, for #2 the HTML5 parser spec is implemented in most browsers. According to it, the parser never emits CDATASection nodes and, depending on the context,

  • either parses the content of the CDATA section as plain text, with different handling of special characters: inside foreign content such as <math> or <svg> the section's content becomes a Text node, and inside <script> everything is raw text anyway, so the markers simply remain part of the script text
  • or treats the CDATA as a "bogus comment", which ends at the first >.
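
For the <script> case, the effect of raw-text parsing can be sketched with Python's standard-library html.parser. It is not an HTML5-conformant browser parser, so this only illustrates the idea that inside <script> the CDATA markers are ordinary characters of the script text; the ScriptText class name and the sample markup are hypothetical.

```python
from html.parser import HTMLParser


class ScriptText(HTMLParser):
    """Collects the character data found inside <script> elements."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        # Data may arrive in several pieces, so collect and join later.
        if self.in_script:
            self.chunks.append(data)


parser = ScriptText()
parser.feed("<script><![CDATA[ x < y ]]></script>")
parser.close()
script_text = "".join(parser.chunks)
print(script_text)
```

The collected text still contains the <![CDATA[ and ]]> markers, which is why scripts written this way usually hide the markers behind JavaScript comments.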

The question of whether and how CDATA should be exposed in the DOM is not agreed upon: despite being removed from the DOM4 spec, CDATASection is still available in at least Gecko (see Mozilla bug 660660, W3C bugs 12841 and 27386).

  • On one hand, from the point of view of an application working with the DOM, a CDATA node is not much different from a text node -- the only difference is how the two are serialized in the markup. So if CDATA is exposed as a separate node type, everyone needs to remember to check for it any time they want to check for a text node.
  • On the other hand, losing the ability to serialize certain sections of the document as CDATA upsets the developers and users of authoring tools, because it forces the serialization to always escape characters special to XML/HTML.
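
The extra check mentioned in the first point looks roughly like this in code. This is again a sketch on top of Python's xml.dom.minidom rather than a browser DOM, and is_text_like, text_of and the sample markup are hypothetical names chosen for illustration.

```python
from xml.dom import minidom
from xml.dom.minidom import Node

# With a separate CDATASection type, "is this text?" needs two comparisons.
TEXT_LIKE = (Node.TEXT_NODE, Node.CDATA_SECTION_NODE)


def is_text_like(node):
    """True for both Text and CDATASection nodes."""
    return node.nodeType in TEXT_LIKE


def text_of(element):
    """Concatenates the data of all text-like direct children."""
    return "".join(c.data for c in element.childNodes if is_text_like(c))


doc = minidom.parseString("<p>1 <![CDATA[< 2]]> is true</p>")
children = doc.documentElement.childNodes

# A check for TEXT_NODE alone silently skips the CDATA part.
only_text = "".join(c.data for c in children if c.nodeType == Node.TEXT_NODE)
all_text = text_of(doc.documentElement)
print(repr(only_text))
print(repr(all_text))
```

Code that tests only for TEXT_NODE loses the middle of the sentence here, which is the kind of mistake the argument for dropping the separate node type is about.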

Nickolay answered Mar 05 '26 18:03


