I have the following XML:
<root>
<child value="&#xFF;&#xEF;&#x2122;&#xE0;"/>
</root>
When I do a transform I want the character hex code values to be preserved. So if my transform was just a simple xsl:copy and the input was the above XML, then the output should be identical to the input.
I have read about the saxon:character-representation function, but right now I'm using Saxon-HE 9.4, so that function is not available to me, and I'm not even 100% sure it would do what I want.
I also read about use-character-maps. This seems to solve my problem, but I would rather not add a giant map to my transform to catch every possible character hex code.
<xsl:character-map name="characterMap">
<xsl:output-character character="&#xA0;" string="&amp;#xA0;"/>
<xsl:output-character character="¡" string="&amp;#xA1;"/>
<!-- 93 more entries... ¡ through þ -->
<xsl:output-character character="ÿ" string="&amp;#xFF;"/>
</xsl:character-map>
Are there any other ways to preserve character hex codes?
The XSLT processor doesn't know how the character was represented in the input - that's all handled by the XML parser. So it can't reproduce the original.
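If you want to output all non-ASCII characters as numeric character references, regardless of how they were represented in the input, try using <xsl:output encoding="us-ascii"/>. The serializer has to escape any character that the output encoding cannot represent, so every non-ASCII character comes out as a character reference.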
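As a minimal sketch of that suggestion, assuming an XSLT 2.0 processor such as Saxon-HE and a plain identity transform (the stylesheet below is illustrative, not taken from the question), the only line that matters is the xsl:output declaration:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- US-ASCII cannot represent characters above U+007F, so the
       serializer writes them as numeric character references. -->
  <xsl:output method="xml" encoding="us-ascii"/>

  <!-- Identity template: copies every node and attribute unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Whether the references are written in decimal or hexadecimal is up to the serializer, and any non-ASCII character that was typed literally in the input will also come out as a reference, so this normalizes the representation rather than preserving it.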
If you really need to retain the original representation - and I can't see any defensible reason why anyone would need to do that - then try Andrew Welch's lexev, which converts all the entity and character references to processing instructions on the way in, and back to entity/character references on the way out.