
What character encoding is &gt;?

In HTML, you can write the greater-than sign ">" as &gt;

and the less-than sign "<" as &lt;.

Is this encoding defined by HTML itself or by some other standard such as ISO, UTF-xxx, BaseXXX, etc.?

Johann asked Sep 11 '25 05:09


2 Answers

This might answer your question. Basically it is HTML encoding for a few predefined characters.

Notations like &gt; and &amp; are HTML entities; specifically, they are named HTML entities.
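To see the notation in action, here is a minimal sketch using Python's standard `html` module. The sample string and variable names are made up for illustration; the point is only that the three characters special to HTML markup are replaced by their named forms and can be turned back again.

```python
import html

# Hypothetical sample text containing the characters that are special in HTML markup.
raw = 'if a > b && b < c'

# Replace &, < and > with &amp;, &lt; and &gt; (quote=False leaves quotes alone).
escaped = html.escape(raw, quote=False)
print(escaped)  # if a &gt; b &amp;&amp; b &lt; c

# Turn the references back into the characters they stand for.
restored = html.unescape(escaped)
print(restored == raw)  # True
```

Note that nothing here changes the character encoding of the text; the escaped string is still ordinary text in whatever encoding the document uses.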

Ethan answered Sep 13 '25 22:09


It is not an encoding at all. Even informally, it is more often called “escape notation” or something like that, not an encoding.

Since the question seems to be just about the name of the construct, here are the correct terms:

  • In SGML, which is what HTML is formally based on up to HTML 4.01, the term is “entity reference”. It so happens that every entity predefined in HTML expands to a single character, which is why they are often informally called “character references” or something similar. Even HTML 4.01 calls them “character references”, but since HTML 4.01 normatively cites SGML, that usage is to be regarded as informal. In SGML, only notations that refer to characters by their code numbers, such as &#62;, are called character references.
  • In XML, which is what XHTML is based on, the term is “entity reference”, too.
  • In HTML5, the term is “named character reference”.
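Whatever the construct is called, the named form and the code-number form refer to the same character. A small sketch with Python's standard `html` module illustrates this; the variable names are my own, and the hexadecimal spelling &#x3E; is added here only for comparison.

```python
import html
from html.entities import name2codepoint

# The same character written three ways: by name, by decimal code number,
# and by hexadecimal code number.
named = html.unescape("&gt;")
decimal = html.unescape("&#62;")
hexadecimal = html.unescape("&#x3E;")
print(named == decimal == hexadecimal == ">")  # True

# The name-to-code-number table for the HTML 4 entity names.
print(name2codepoint["gt"])  # 62
print(name2codepoint["lt"])  # 60
```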
Jukka K. Korpela answered Sep 13 '25 22:09