
XML serialization in JSON without excessive escaping

How to avoid solidus and double quote escaping of XML in JSON?

Given that

  1. solidus characters (aka forward slash, /) may, but need not, be escaped in JSON, and that
  2. XML attributes may use ' rather than " to avoid escaping in JSON string values,

what's the best way to realize these potential serialization improvements in XSLT?
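As a quick check of premise 1, here is a minimal sketch, assuming an XSLT 3.0 processor such as Saxon that is invoked via its initial template. It compares the escaped and unescaped spellings of the same JSON string after decoding; if the escaping really is optional, the two should decode to the identical string.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0">

  <xsl:output method="text"/>

  <!-- Assumes the processor is started at the initial template (no source document needed). -->
  <xsl:template name="xsl:initial-template">
    <!-- Both JSON texts should decode to the same string x/y/z, so the comparison should be true. -->
    <xsl:value-of select="parse-json('&quot;x\/y\/z&quot;') eq parse-json('&quot;x/y/z&quot;')"/>
  </xsl:template>

</xsl:stylesheet>
```

Since the two forms carry the same data, dropping the backslashes changes only the size and readability of the serialized JSON.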


This XML,

<?xml version="1.0" encoding="UTF-8"?>
<map xmlns="http://www.w3.org/2005/xpath-functions">
  <array key="o_array">
    <map>
      <string key="s/1">x/y/z</string>
    </map>
    <map>
      <string key="s2"><![CDATA[<a href="/x/y">Link</a> a/b "test"]]></string>
    </map>
  </array>
</map>

input to this XSLT,

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0">
  <xsl:output method="text"/>  
  <xsl:template match="/">
    <xsl:value-of select="xml-to-json(.,map{'indent':true()})"/>
  </xsl:template>
</xsl:stylesheet>

yields (via Saxon, XSLT Fiddle demo) this JSON output:

{ "o_array" : 
  [ 
    { "s\/1" : "x\/y\/z" },

    { "s2" : "<a href=\"\/x\/y\">Link<\/a> a\/b \"test\"" } ] }

For aesthetics (the JSON above is unnecessarily ugly) and to minimize file size (after also disabling indentation), I would like to generate the following JSON instead:

{ "o_array" : 
  [ 
    { "s/1" : "x/y/z" },

    { "s2" : "<a href='/x/y'>Link</a> a/b \"test\"" } ] }

Notes:

  • Single quotes: A Saxon-specific serialization option, saxon:single-quotes, seems tantalizingly close to helping, but how to use this option with xml-to-json() is unclear to me.
  • Solidus: An XSLT serialization option, map{'method': 'json', 'use-character-maps': map{ '/': '/' }} as described by Martin Honnen, seems tantalizingly close to helping, but, again, how to use this option with xml-to-json() escapes (ha) me.
  • string/@escaped and string/@escaped-key attributes, per my reading of the spec and confirmed via experimentation, cannot help here.
kjhughes asked Aug 31 '25 10:08



1 Answer

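xml-to-json() returns an already-serialized string, so serialization parameters cannot reach it directly. The linked suggestion with a character map can therefore only be used if you are willing to introduce a parse-json() => serialize(...) step: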

. => xml-to-json() => parse-json() => serialize(map { 'method' : 'json', 'use-character-maps' : map { '/' : '/' } })

That way, with

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="3.0">

  <xsl:output method="text"/>

  <xsl:template match="/">
      <xsl:value-of select=". => xml-to-json() => parse-json() => serialize(map { 'method' : 'json', 'use-character-maps' : map { '/' : '/' } })"/>
  </xsl:template>

</xsl:stylesheet>

at https://xsltfiddle.liberty-development.net/b4GWVd/25 I get

{"o_array":[{"s/1":"x/y/z"},{"s2":"<a href=\"/x/y\">Link</a> a/b \"test\""}]}
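The same character-map idea can probably also be written declaratively, letting the stylesheet's own JSON output method do the serializing instead of calling serialize(). This is only a sketch under assumptions: the character map name keep-solidus is made up for illustration, and I have not run it against the fiddle above.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0">

  <!-- Serialize the result as JSON and route "/" through the character map below. -->
  <xsl:output method="json" use-character-maps="keep-solidus"/>

  <!-- "keep-solidus" is a hypothetical name; mapping "/" to itself is meant to bypass the default escaping. -->
  <xsl:character-map name="keep-solidus">
    <xsl:output-character character="/" string="/"/>
  </xsl:character-map>

  <xsl:template match="/">
    <!-- Return the parsed map rather than a string, so the JSON output method serializes it. -->
    <xsl:sequence select=". => xml-to-json() => parse-json()"/>
  </xsl:template>

</xsl:stylesheet>
```

The parse-json() step is still needed here: the character map only takes effect when the serializer sees a map or array, not a string that xml-to-json() has already escaped.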

To apply the Saxon-specific serialization parameter to string values that are XML fragments, I think you could first run the input through a mode that parses each string as an XML fragment and reserializes it, this time as

. => parse-xml-fragment() => serialize(map {
                        'method': 'xml',
                        QName('http://saxon.sf.net/', 'single-quotes'): true()
                    })

With Saxon 9.9 EE in oXygen and

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0">

    <xsl:output method="text"/>

    <xsl:template match="/">
        <xsl:value-of
            select="
                $single-quotes => xml-to-json() => parse-json() => serialize(map {
                    'method': 'json',
                    'use-character-maps': map {'/': '/'}
                })"
        />
    </xsl:template>

    <xsl:variable name="single-quotes">
        <xsl:apply-templates mode="serialize-fragments"/>
    </xsl:variable>

    <xsl:mode name="serialize-fragments" on-no-match="shallow-copy"/>

    <xsl:template match="string" mode="serialize-fragments"
        xpath-default-namespace="http://www.w3.org/2005/xpath-functions">
        <xsl:copy>
            <xsl:apply-templates select="@*" mode="#current"/>
            <xsl:try
                select="
                    . => parse-xml-fragment() => serialize(map {
                        'method': 'xml',
                        QName('http://saxon.sf.net/', 'single-quotes'): true()
                    })">
                <xsl:catch select="string()"/>
            </xsl:try>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>

I get

{"o_array":[{"s/1":"x/y/z"},{"s2":"<a href='/x/y'>Link</a> a/b \"test\""}]}
Martin Honnen answered Sep 03 '25 01:09
