
How do I extract any text preceding a certain node using XPath 1.0?

I need a single XPath expression to select any text preceding a node, regardless of the structure and hierarchy. For example, how do I extract text before the node <target/> in the following cases:

Case 1:

<a>1</a>
<b>2</b>
<target/>

Expected result: 2

Case 2:

<p>1</p>
<do>
  <bt>2</bt>
</do>
<target/>

Expected result: 2

Case 3:

<aa>Text <b>child text</b></aa>
<target/>

Expected result: 'child text' or 'Text child text'

Case 4:

<p>Text <b>child text</b> tail</p>
<target/>

Expected result: 'tail', 'Text tail' or 'Text child text tail'

And so on; the structure before <target/> can vary arbitrarily. All I actually need is the last character of the preceding text, so it doesn't matter whether the result also includes text from nested or intermediate child elements.

Cuder asked Dec 08 '25 11:12


1 Answer

//target/preceding::text()[normalize-space(.) != ''][1]

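Use [1] rather than [last()] because preceding is a reverse axis: positions are counted backwards from the context node, so [1] is the text node closest to <target/>. The [normalize-space(.) != ''] predicate comes first so that text nodes consisting only of whitespace are filtered out before the position is taken.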
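To see what the expression picks in the four cases from the question, here is a minimal sketch in Python. It does not run an XPath engine: the standard library's xml.dom.minidom has none, so the function walks the tree in document order and remembers the last non-blank text node seen before the target, which is what preceding::text()[normalize-space(.) != ''][1] selects. The function name last_text_before and the wrapping <root> element are assumptions made for illustration; CDATA sections and multiple targets are ignored.

```python
from xml.dom import minidom

XML_WHITESPACE = " \t\r\n"  # the characters normalize-space() treats as whitespace


def last_text_before(xml_fragment, tag="target"):
    """Emulate //target/preceding::text()[normalize-space(.) != ''][1]
    for the first <target/> in document order."""
    # Hypothetical <root> wrapper so a fragment with several
    # top-level elements parses as one document.
    doc = minidom.parseString("<root>" + xml_fragment + "</root>")
    nearest = None

    def walk(node):
        nonlocal nearest
        for child in node.childNodes:
            if child.nodeType == child.ELEMENT_NODE:
                if child.tagName == tag:
                    return True          # reached the target: stop here
                if walk(child):
                    return True
            elif child.nodeType == child.TEXT_NODE:
                if child.data.strip(XML_WHITESPACE):
                    nearest = child.data  # latest non-blank text so far
        return False

    return nearest if walk(doc.documentElement) else None


cases = {
    "Case 1": "<a>1</a>\n<b>2</b>\n<target/>",
    "Case 2": "<p>1</p>\n<do>\n  <bt>2</bt>\n</do>\n<target/>",
    "Case 3": "<aa>Text <b>child text</b></aa>\n<target/>",
    "Case 4": "<p>Text <b>child text</b> tail</p>\n<target/>",
}

for name, fragment in cases.items():
    text = last_text_before(fragment)
    # The question only needs the last character of the preceding text.
    print(name, repr(text), repr(text.strip(XML_WHITESPACE)[-1]))
```

If the last character is needed inside XPath 1.0 itself, one possible approach, assuming a single <target/>, is to wrap the expression as substring(normalize-space(X), string-length(normalize-space(X))), where X stands for the expression above.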

Björn Tantau answered Dec 11 '25 01:12
