
Both XPath/getChildElements failed to get XML child in XOM

Tags:

java

xml

scala

xom

I have to parse an OAI-PMH XML file that looks like the following. I would like to iterate over all <record> nodes in ListRecords.

<?xml version="1.0" encoding="UTF-8"?>
<OAI-PMH xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd" xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <responseDate>2010-12-30T10:46:39.654+08:00</responseDate>
  <request verb="ListRecords" metadataPrefix="oai_dc">http://172.16.1.118/ahd/oai2.do</request>
  <ListRecords>
    <record>
      <header>
        <identifier>9010402101001001</identifier>
      </header>
      <metadata>
        <oai_dc:dc xsi:schemaLocationfiltered="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd" xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:identifier>9010402101001001</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>1509/1509</resumptionToken>
  </ListRecords>
</OAI-PMH>

But when I use XOM 1.2.5 to get those nodes, no matter which method I use (query or getChildElements), it always returns 0 nodes.

The following is the code I ran in the Scala interpreter:

scala> import nu.xom.Builder
import nu.xom.Builder

scala> val builder = new Builder
builder: nu.xom.Builder = nu.xom.Builder@6682d439

scala> val document = builder.build(new java.io.File("/home/brianhsu/qqq.xml"))
document: nu.xom.Document = [nu.xom.Document: OAI-PMH]

scala> document.query("//record").size
res0: Int = 0

scala> document.query("//ListRecords").size
res1: Int = 0

scala> document.getRootElement.getChildElements("ListRecords").size
res2: Int = 0

I have no idea why I cannot get ListRecords and record from the XML. Did I miss something?

Brian Hsu, asked Feb 04 '26 09:02


2 Answers

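I found that this is a duplicate of "XPath Expression returns nothing for //element, but //* returns a count".

The following code works: the element names in the XPath expression have to be bound to the document's default namespace through an XPathContext. The prefix is arbitrary (here "xsi", which is unrelated to the xsi prefix declared in the document); only the namespace URI has to match.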

scala> import nu.xom.XPathContext
import nu.xom.XPathContext

scala> val context = new XPathContext("xsi", "http://www.openarchives.org/OAI/2.0/")
context: nu.xom.XPathContext = nu.xom.XPathContext@19a3f495

scala> document.query("//xsi:record", context).size
res6: Int = 1
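
To go from counting the records to iterating over them, something like the following sketch might do. It assumes XOM is on the classpath, uses a trimmed inline copy of the XML from the question, and picks the prefix "oai" purely for illustration, since any prefix bound to the OAI-PMH namespace URI would work.

```scala
import nu.xom.{Builder, Element, XPathContext}

// Namespace URI taken from the xmlns attribute of the document in the question.
val OaiNs = "http://www.openarchives.org/OAI/2.0/"

// Trimmed, inline copy of the OAI-PMH response (an assumption made for this sketch).
val xml =
  """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
    |  <ListRecords>
    |    <record><header><identifier>9010402101001001</identifier></header></record>
    |    <resumptionToken>1509/1509</resumptionToken>
    |  </ListRecords>
    |</OAI-PMH>""".stripMargin

// build(String, String) parses XML text; the second argument is the base URI.
val document = new Builder().build(xml, null)

// "oai" is an arbitrary label; it only has to map to the elements' namespace URI.
val context = new XPathContext("oai", OaiNs)
val records = document.query("//oai:record", context)

// Nodes is not a Scala collection, so iterate by index.
for (i <- 0 until records.size) {
  val record = records.get(i).asInstanceOf[Element]
  // Relative XPath: every step needs the prefix, because every element is in the namespace.
  val identifier = record.query("oai:header/oai:identifier", context).get(0).getValue
  println(identifier)
}
```

Note that every step of a path needs the prefix, not just the first one, because the default namespace is inherited by all unprefixed descendant elements.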

Brian Hsu, answered Feb 05 '26 23:02


I'll wager that it is an xmlns issue. Have you tried passing the namespace URI as the second argument? Try:

 document.getRootElement
         .getChildElements("ListRecords", 
                           "http://www.openarchives.org/OAI/2.0/").size

Basically, when an XML document declares a default namespace, many XML libraries require that namespace URI to look a node up, even though the element carries no prefix in the document itself.

(This can also be done using the XPathContext object, as illustrated by Brian Hsu)
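
As a rough sketch of the same idea without XPath, the records could be walked with the two-argument getChildElements and getFirstChildElement. It assumes XOM is on the classpath and uses a trimmed inline copy of the XML from the question; the value names are only illustrative.

```scala
import nu.xom.Builder

// Namespace URI taken from the xmlns attribute of the document in the question.
val OaiNs = "http://www.openarchives.org/OAI/2.0/"

// Trimmed, inline copy of the OAI-PMH response (an assumption made for this sketch).
val xml =
  """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
    |  <ListRecords>
    |    <record><header><identifier>9010402101001001</identifier></header></record>
    |    <resumptionToken>1509/1509</resumptionToken>
    |  </ListRecords>
    |</OAI-PMH>""".stripMargin

val root = new Builder().build(xml, null).getRootElement

// Pass the namespace URI with every local name; the one-argument overloads
// only match elements that are in no namespace.
val listRecords = root.getFirstChildElement("ListRecords", OaiNs)
val records = listRecords.getChildElements("record", OaiNs)

// Index-based loop, since Elements in XOM 1.2.5 is not a Scala collection.
for (i <- 0 until records.size) {
  val header = records.get(i).getFirstChildElement("header", OaiNs)
  println(header.getFirstChildElement("identifier", OaiNs).getValue)
}
```

Here the resumptionToken sibling is skipped simply because only children named record are requested.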

cwallenpoole, answered Feb 05 '26 23:02

