
Efficient way to validate xml?

Tags: c#, xml

I need to validate incoming XML as fast as possible so that I can receive the next XML message on the socket.

I am using this method to validate the received XML data:

private bool validateRecievedXmlCallback()
{
  try
  {
    // Dispose the reader when done; Read() throws on malformed input.
    using (XmlReader xreader = XmlReader.Create(new StringReader(xmlData)))
    {
      while (xreader.Read()) ;
    }
  }
  catch (Exception)
  {
    return false;
  }

  return true;
}

But I don't think this method is efficient enough. I actually only need to check the last tag.

example:

<test valueA="1" valueB="2">
   <data valueC="1" />
   <data valueC="5" />
   <data valueC="5">220</data>
</test>  // I need to check whether the </test> tag is there to close the element, but what's the best way to do it?
Racooon asked Dec 02 '25 12:12


2 Answers

If you stick with the XmlReader, you could use XmlReader.Skip() which, well, skips the content of the current element.

So

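if (!xreader.IsStartElement("test")) // moves to the document root without consuming it; false if it is not <test>
    throw new XmlException("Root element is not <test>.");
xreader.Skip(); // skips the whole root element; throws if the document is not well-formed, e.g. root has no closing tag.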

As other commenters have already stated, there is no good way of guaranteeing the well-formedness of an XML document other than using an XML parser.
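For completeness, here is a minimal sketch of how the Skip() approach might look as a self-contained method. The class name XmlCheck, the method name IsWellFormed and the rootName parameter are hypothetical names chosen for illustration, not part of the original question. It assumes the whole message is already available as a string.

```csharp
using System;
using System.IO;
using System.Xml;

static class XmlCheck
{
    // Hypothetical helper: true if xmlData is well-formed XML whose root element is named rootName.
    public static bool IsWellFormed(string xmlData, string rootName)
    {
        if (xmlData == null)
            return false;

        try
        {
            using (XmlReader reader = XmlReader.Create(new StringReader(xmlData)))
            {
                // IsStartElement calls MoveToContent and compares the name
                // without consuming the element.
                if (!reader.IsStartElement(rootName))
                    return false;

                // Skip reads past the root's matching end tag and throws
                // XmlException if the content is not well-formed.
                reader.Skip();

                // Drain whatever follows the root (whitespace, comments);
                // stray content after the root also throws XmlException.
                while (reader.Read()) { }
                return true;
            }
        }
        catch (XmlException)
        {
            return false;
        }
    }
}
```

Note that Skip() still has to parse everything inside the root element, so this mainly saves the caller from handling each node rather than skipping the parsing work itself.", "test")) throw new Exception("well-formed document rejected");
if (!XmlCheck.IsWellFormed("<test />", "test")) throw new Exception("self-closing root rejected");
if (XmlCheck.IsWellFormed("<test><data valueC=\"1\" />", "test")) throw new Exception("unclosed root accepted");
if (XmlCheck.IsWellFormed("<test><data></test>", "test")) throw new Exception("mismatched tags accepted");
if (XmlCheck.IsWellFormed("<other></other>", "test")) throw new Exception("wrong root accepted");
if (XmlCheck.IsWellFormed("", "test")) throw new Exception("empty input accepted");
if (XmlCheck.IsWellFormed(null, "test")) throw new Exception("null input accepted");
</test>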

Sven Künzler answered Dec 04 '25 02:12


Anyone actually facing the same challenge as the OP: Refer to the answer by Sven Künzler and never think about building your own XML "validation" again.


Edit: Added self-closing tag regex check.

Edit2: Made regex actually do what it's supposed to

Edit3: Edit double-closed tag check (hat tip to RichardW1001)

private bool validateRecievedXmlCallback(IAsyncResult ar)
{
    // A lone self-closing root such as <test valueA="1" /> counts as closed.
    string sPattern = @"^<test([^>]*) \/>$";
    if (System.Text.RegularExpressions.Regex.IsMatch(xmlData, sPattern))
    {
        return true;
    }

    // Otherwise require exactly one "</test>" in the data.
    int firstOccurrence = xmlData.IndexOf("</test>");
    int lastOccurrence  = xmlData.LastIndexOf("</test>");
    return (firstOccurrence != -1) && (firstOccurrence == lastOccurrence);
}

Disclaimer: It is generally a stupid idea to try and "validate" XML by means of regex, IndexOf() or any other "homegrown" methods. Just use a proper XML parser.
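To illustrate why the disclaimer matters, the sketch below puts an IndexOf-style check next to a parser-based one. NaiveVsParser, NaiveCheck and ParserCheck are hypothetical names introduced here, and NaiveCheck mirrors only the IndexOf part of the method above, not the regex branch.

```csharp
using System;
using System.IO;
using System.Xml;

static class NaiveVsParser
{
    // Homegrown check: exactly one "</test>" somewhere in the string.
    public static bool NaiveCheck(string xml)
    {
        int first = xml.IndexOf("</test>", StringComparison.Ordinal);
        int last = xml.LastIndexOf("</test>", StringComparison.Ordinal);
        return first != -1 && first == last;
    }

    // Parser check: read every node; XmlException means not well-formed.
    public static bool ParserCheck(string xml)
    {
        try
        {
            using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
            {
                while (reader.Read()) { }
            }
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }
}
```

With these two helpers, "" passes NaiveCheck even though the data element is never closed, while ParserCheck rejects it. In the other direction, "<test><!-- </test> --></test>" is well-formed and passes ParserCheck, but NaiveCheck rejects it because the closing tag text also appears inside a comment.
<test>
if (!NaiveVsParser.NaiveCheck("<test><data></test>")) throw new Exception("naive check should accept the mismatched document");
if (NaiveVsParser.ParserCheck("<test><data></test>")) throw new Exception("parser should reject the mismatched document");
if (NaiveVsParser.NaiveCheck("<test><!-- </test> --></test>")) throw new Exception("naive check should reject the commented document");
if (!NaiveVsParser.ParserCheck("<test><!-- </test> --></test>")) throw new Exception("parser should accept the commented document");
if (!NaiveVsParser.ParserCheck("<test valueA=\"1\"><data valueC=\"5\">220</data></test>")) throw new Exception("parser should accept a well-formed document");
</test>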

vzwick answered Dec 04 '25 02:12


