
Substrings from iterable node

Please consider this sample file: http://www.w3schools.com/dom/books.xml

This XPath expression, //title/text(), returns:

Everyday Italian
Harry Potter
XQuery Kick Start
Learning XML

Now I want just the first word of each title, so I tried tokenize(//title/text(),' ')[1], which fails with:

Too many items

On the other hand, tokenize((//title/text())[1],' ')[1] returns the first word, but only for the first node.

How can I get substrings with XPath while iterating nodes?

theta asked Mar 24 '26 07:03



1 Answer

Use:

//text()/tokenize(.,' ')[1]

This produces a sequence of the first "word" of every text node in the XML document.
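The reason this works is that the "/" step evaluates its right-hand side once per node, so tokenize() receives a single string each time instead of the whole sequence. As a rough analogy only (this is Python, not XPath, and the names SAMPLE, first_token and first_words are made up for illustration), the same per-node mapping can be sketched with the standard xml.etree.ElementTree module:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: four elements, no whitespace between them,
# so each element contributes exactly one text node.
SAMPLE = (
    "<t>"
    "<a>Everyday Italian</a>"
    "<b>Harry Potter</b>"
    "<c>XQuery Kick Start</c>"
    "<d>Learning XML</d>"
    "</t>"
)


def first_token(text):
    """Rough stand-in for tokenize(., ' ')[1]: split on single spaces, keep the first item."""
    return text.split(" ")[0]


def first_words(xml_text):
    """Apply first_token once per text node, like the right-hand side of a '/' step."""
    root = ET.fromstring(xml_text)
    return [first_token(child.text) for child in root if child.text]


if __name__ == "__main__":
    print(first_words(SAMPLE))
```

Calling first_token on the whole list at once would fail, in the same spirit as the "Too many items" error: the function expects one string, not a sequence of them.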

XSLT 2.0-based verification:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>

    <xsl:template match="/">
        <xsl:sequence select="//text()/tokenize(.,' ')[1]"/>
    </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the following XML document:

<t>
    <a>Everyday Italian</a>
    <b>Harry Potter</b>
    <c>XQuery Kick Start</c>
    <d>Learning XML</d>
</t>

the XPath expression is evaluated and the result of this evaluation is copied to the output:

 Everyday 
 Harry 
 XQuery 
 Learning 

The output above also includes the first tokens of a few whitespace-only text nodes (the line breaks and indentation between the elements).

If you want to ignore any whitespace-only text node, change the XPath expression to:

//text()[normalize-space()]/tokenize(.,' ')[1]
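
To make the effect of the [normalize-space()] predicate concrete, here is a hedged Python sketch (an analogy, not XPath). It assumes the indented sample document from this answer; text_nodes and first_words_nonblank are hypothetical helper names. ElementTree exposes text nodes as .text and .tail, so the helper walks both to approximate //text() in document order:

```python
import xml.etree.ElementTree as ET

# The indented sample document: the line breaks and indentation
# between elements become whitespace-only text nodes.
SAMPLE_INDENTED = """<t>
    <a>Everyday Italian</a>
    <b>Harry Potter</b>
    <c>XQuery Kick Start</c>
    <d>Learning XML</d>
</t>"""

XML_WHITESPACE = " \t\r\n"  # the characters XML treats as whitespace


def text_nodes(elem):
    """Yield the text nodes under elem in document order (roughly //text())."""
    if elem.text:
        yield elem.text
    for child in elem:
        yield from text_nodes(child)
        if child.tail:
            yield child.tail


def first_words_nonblank(xml_text):
    """First space-separated token of every text node that is not whitespace-only."""
    root = ET.fromstring(xml_text)
    return [
        t.split(" ")[0]
        for t in text_nodes(root)
        if t.strip(XML_WHITESPACE)  # plays the role of [normalize-space()]
    ]


if __name__ == "__main__":
    print(first_words_nonblank(SAMPLE_INDENTED))
```

Without the filter, this document yields nine text nodes, five of which are whitespace-only; with it, only the four titles remain.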
Dimitre Novatchev answered Mar 27 '26 02:03



