
Can an XML start with anything other than a "<"?

Tags: string, file, xml, w3c

Can an XML start with anything other than a < character?

It was a random thought I had while trying to work out how to differentiate a string containing an XML document from one containing a path to an XML file.

I believe the answer is no, but I'm looking to be certain.

Kilazur asked Nov 06 '25 21:11


2 Answers

Only a < or a whitespace character can begin a well-formed XML document.

The W3C XML Recommendation includes an EBNF grammar that definitively defines an XML document:

 [1] document ::= prolog element Misc*
[22] prolog   ::= XMLDecl? Misc* (doctypedecl Misc*)?
[23] XMLDecl  ::= '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'
[27] Misc     ::= Comment | PI | S
 [3] S        ::= (#x20 | #x9 | #xD | #xA)+

From these rules it follows that an XML document may start with a whitespace character or a < character from any one of the following constructs:

  • XML Declaration
  • Comment
  • PI
  • Doctype Declaration
  • Element

An XML document may start with no other character.
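
For the use case in the question (telling a string of XML apart from a path), the rule above can be turned into a cheap pre-check. What follows is a minimal Python sketch under stated assumptions, not a definitive implementation: the function names `looks_like_xml` and `is_well_formed_xml` are made up for illustration, and the standard-library `xml.etree.ElementTree` parser is used only to confirm well-formedness.

```python
import xml.etree.ElementTree as ET

# The four whitespace characters allowed by production [3] S.
XML_WHITESPACE = " \t\r\n"


def looks_like_xml(text: str) -> bool:
    """Cheap pre-check: the first character after XML whitespace must be '<'."""
    return text.lstrip(XML_WHITESPACE).startswith("<")


def is_well_formed_xml(text: str) -> bool:
    """Run the pre-check, then let a real parser decide well-formedness."""
    if not looks_like_xml(text):
        return False
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    for candidate in ("<a><b/></a>", "  \n<a/>", "/path/to/file.xml", "<a><b></a>"):
        print(repr(candidate), looks_like_xml(candidate), is_well_formed_xml(candidate))
```

The pre-check can only rule a string out. Some filesystems allow `<` in file names, so a string that passes it is not necessarily XML; that is why the sketch falls back to an actual parse before answering yes.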

Notes:

  1. An implication of these rules is that if an XML document contains an XML declaration, the declaration must appear at the very start of the document; otherwise the parser will report an error, often a somewhat cryptic one. So, for XML documents with an XML declaration, the first character has to be a < and cannot be whitespace.
  2. A BOM (byte order mark) may appear at the beginning of an XML document entity to indicate the byte order of the character encoding being used. These bytes (two in UTF-16, three in UTF-8) are typically not considered to be part of the XML document itself but rather part of the storage unit of the physical structure supporting the XML document. A BOM, along with an XML declaration, assists XML processors in character encoding detection. [Suggestion for BOM mention thanks to JonHanna]
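
When the input arrives as raw bytes, a BOM may therefore precede the first < or whitespace character. The following Python sketch shows one way to skip it before inspecting the first character; `strip_bom` is a hypothetical helper name, and the BOM constants come from the standard-library `codecs` module.

```python
import codecs

# UTF-32 BOMs are checked before UTF-16 ones because the UTF-32 LE BOM
# (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE).
_BOMS = (
    codecs.BOM_UTF32_LE,
    codecs.BOM_UTF32_BE,
    codecs.BOM_UTF8,
    codecs.BOM_UTF16_LE,
    codecs.BOM_UTF16_BE,
)


def strip_bom(data: bytes) -> bytes:
    """Return data without a leading byte order mark, if one is present."""
    for bom in _BOMS:
        if data.startswith(bom):
            return data[len(bom):]
    return data


if __name__ == "__main__":
    raw = codecs.BOM_UTF8 + b"<root/>"
    print(strip_bom(raw))
```

For UTF-8 input specifically, decoding with Python's `utf-8-sig` codec drops the BOM as part of decoding, which may be simpler than handling the bytes by hand.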

kjhughes answered Nov 08 '25 14:11


A well-formed XML document entity always has "<" as its first non-whitespace character.

A well-formed external general parsed entity need not start with "<".

So if by "an XML" you mean "a well-formed XML document entity", then the answer is "no".

Michael Kay answered Nov 08 '25 15:11


