innerHTML does not preserve character references in attribute values as escape sequences. Is there a workaround?

I loaded the following markup in Firefox:

<div id="my">
    <div title="&lt;b&gt; title &lt;/b&gt;">&lt;b&gt; text &lt;/b&gt;</div>
</div>

After evaluation of:

document.getElementById("my").innerHTML

I got:

"<div title="<b> title </b>">&lt;b&gt; text &lt;/b&gt;</div>"

As you can see, the title attribute is damaged... Is this a bug?

Link to playground: http://jsfiddle.net/uGXQP/

UPDATE I ran the test in Firefox and Opera, and the problem is reproduced in both. In Chrome it works as expected.

UPDATE 2 I found this issue while working with the http://code.google.com/p/canvg/ project. Its main routine requires the SVG as a string input argument. The natural way to get that string is to use innerHTML...

So this library parses the serialized DOM string and can't properly handle < and > characters when they appear in an attribute value...

gavenkoa asked Jan 25 '26 19:01


1 Answer

The innerHTML property does not, in general, return the markup as it was written in the HTML source document. Instead, it serializes the content of the element as it exists in the DOM. There is no standard on this yet, but the HTML5 CR specifies rules for serializing HTML fragments, and the DOM Parsing and Serialization WD defines innerHTML in terms of such serialization. This means that many parts of the markup are canonicalized, and this includes the principle that within an attribute value, the character “<” appears as such. If it was written as &lt; in the HTML, it was converted to “<” during parsing; the DOM has no information about the original syntax used to represent “<”. Unfortunately, some browsers get this wrong.
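A minimal sketch of the attribute-escaping step described above, assuming the fragment serialization rules as summarized in this answer (inside an attribute value only `&`, the no-break space, and `"` are escaped). The names `serializeAttrValue` and `escapeAttrStrict` are hypothetical helpers, not browser APIs. The stricter variant also escapes `<` and `>`, which is one possible way to prepare attribute values for a consumer that cannot cope with raw angle brackets.

```javascript
// Hypothetical helper: escape an attribute value following the rules
// summarized above ("<" and ">" are left as-is).
function serializeAttrValue(value) {
  return value
    .replace(/&/g, "&amp;")
    .replace(/\u00A0/g, "&nbsp;")
    .replace(/"/g, "&quot;");
}

// Hypothetical stricter variant: additionally escape angle brackets.
function escapeAttrStrict(value) {
  return serializeAttrValue(value)
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// The DOM holds the parsed value, not the original source text.
const parsedTitle = "<b> title </b>";
console.log(serializeAttrValue(parsedTitle)); // <b> title </b>
console.log(escapeAttrStrict(parsedTitle));   // &lt;b&gt; title &lt;/b&gt;
```

In a browser, a helper like the stricter one could be applied to each attribute value while building the string by hand instead of relying on innerHTML; that traversal is not shown here.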

Anyway, if you want an attribute value to contain the four characters &lt;, you need to write them with the ampersand itself escaped: &amp;lt;.
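To illustrate why the double escaping works, here is a rough sketch of a single decoding pass. `decodeOnce` and `ENTITIES` are hypothetical names, and the function handles only four named references; a real HTML parser does far more. The point is that each reference is decoded exactly once, so `&amp;lt;` ends up as the four characters `&lt;` rather than as `<`.

```javascript
// Hypothetical single-pass decoder for a handful of character references.
const ENTITIES = { "&amp;": "&", "&lt;": "<", "&gt;": ">", "&quot;": '"' };

function decodeOnce(source) {
  // String.prototype.replace does not rescan its own output,
  // so the "&" produced from "&amp;" is not decoded a second time.
  return source.replace(/&(?:amp|lt|gt|quot);/g, (ref) => ENTITIES[ref]);
}

console.log(decodeOnce("&lt;b&gt;"));         // <b>
console.log(decodeOnce("&amp;lt;b&amp;gt;")); // &lt;b&gt;
```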

Jukka K. Korpela answered Jan 28 '26 12:01