
XML DocumentBuilder removes CDATA Section

Tags: java, xml, cdata

I have a web app on WebLogic that:
1. reads XML from a database
2. parses it
3. adds a new section

The source XML has CDATA sections:

<?xml version="1.0" encoding="UTF-8" ?>     
    <script type="calcscript">
    <![CDATA[  some data ]]>
    </script>

When I parse the XML:

  DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
  DocumentBuilder builder = factory.newDocumentBuilder();
  Document xml = builder.parse(new ByteArrayInputStream(bytes));

It removes the CDATA section!
After converting it back to a string:

Transformer transformer = TransformerFactory.newInstance().newTransformer();
StringWriter sw = new StringWriter();
Result output = new StreamResult(sw);
Source input = new DOMSource(xml);
transformer.transform(input, output);

I get XML like this:

<?xml version="1.0" encoding="UTF-8" ?> 
<script type="calcscript">
some data
</script>

Why does it remove CDATA sections? Maybe WebLogic includes old Java libraries that do not support CDATA sections.

P.S. When I run the app on a Tomcat server or as a Java application, everything works fine.

rpc1 asked Apr 21 '26 03:04



1 Answer

First of all, the parsing process does not remove the CDATA information. Look at some debug info:

[Image: Debug Variables]
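The same check can be made without a debugger by looking at the node types in the parsed document. The following is only a minimal sketch, assuming the JDK's default parser is in use; the class name `CdataParseCheck` and the method `hasCdataChild` are hypothetical names chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class CdataParseCheck {

    // Hypothetical helper: returns true if the root element has a
    // CDATA section node among its direct children after parsing.
    static boolean hasCdataChild(String xmlText) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Coalescing is off by default; when it is on, the parser merges
        // CDATA sections into ordinary text nodes.
        factory.setCoalescing(false);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(
                new ByteArrayInputStream(xmlText.getBytes(StandardCharsets.UTF_8)));

        NodeList children = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i).getNodeType() == Node.CDATA_SECTION_NODE) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<script type=\"calcscript\"><![CDATA[  some data ]]></script>";
        System.out.println("CDATA node present after parsing: " + hasCdataChild(xml));
    }
}
```

If this reports that the CDATA node is present, the loss happens later, during serialization.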

Second, it is the transformation step that gets rid of those CDATA sections, because preserving them is simply not defined in the spec (see the answer from Michael Kay in this question).

You can, however, set an output property on the transformer that tells it to write the text content of the named elements as CDATA sections:

transformer.setOutputProperty(OutputKeys.CDATA_SECTION_ELEMENTS, "script");

Now you will have the CDATA section in the output.
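Putting the question's parsing and serialization code together with this property gives a round trip along the following lines. This is a sketch under assumptions, not a definitive implementation: the class name `CdataRoundTrip` and the method `roundTrip` are hypothetical. Whether CDATA survives when the property is not set appears to depend on the Transformer implementation in use, which would fit the difference between WebLogic and Tomcat reported in the question.

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

public class CdataRoundTrip {

    // Hypothetical helper: parses xmlText and serializes it back to a string.
    // cdataElements is a whitespace-separated list of element names whose
    // text content should be written as CDATA; pass null to leave it unset.
    static String roundTrip(String xmlText, String cdataElements) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(
                new ByteArrayInputStream(xmlText.getBytes(StandardCharsets.UTF_8)));

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        if (cdataElements != null) {
            transformer.setOutputProperty(OutputKeys.CDATA_SECTION_ELEMENTS, cdataElements);
        }

        StringWriter sw = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(sw));
        return sw.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<script type=\"calcscript\"><![CDATA[  some data ]]></script>";
        System.out.println(roundTrip(xml, "script"));
    }
}
```

Note that the property works by element name, so every `script` element in the document has its text written as CDATA, not only the ones that used CDATA in the source.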

Seelenvirtuose answered Apr 22 '26 15:04



