 

Using Nokogiri to find element before another element

I have a partial HTML document:

<h2>Destinations</h2>
<div>It is nice <b>anywhere</b> but here.
<ul>
  <li>Florida</li>
  <li>New York</li>
</ul>
<h2>Shopping List</h2>
<ul>
  <li>Booze</li>
  <li>Bacon</li>
</ul>

For every <li> item, I want to know the category the item is in, i.e., the text of the nearest <h2> tag above it.

This code does not work, but this is what I'm trying to do:

@page.search('li').each do |li|
  li.previous('h2').text
end
Hamptonite asked Sep 03 '25 10:09



2 Answers

Nokogiri lets you use XPath expressions to locate an element:

categories = []

doc.xpath("//li").each do |elem|
  categories << elem.parent.xpath("preceding-sibling::h2").last.text
end

categories.uniq!
p categories

The first part finds all "li" elements. For each one, we take its parent (a ul or ol) and then look for h2 elements among the parent's preceding siblings (preceding-sibling). There can be more than one, so we take the last (i.e., the one closest to the current position).

We need to call "uniq!" because we collect the h2 once for each 'li' (the 'li' is the starting point), so the same category shows up several times.

Using your own HTML example, this code outputs:

["Destinations", "Shopping List"]
Martin answered Sep 05 '25 00:09



You are close.

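@page.search('li').each do |li|
  # [1] selects the nearest preceding <h2>; without it, the text of
  # every earlier <h2> would be concatenated into one string.
  category = li.at_xpath('../preceding-sibling::h2[1]').text
  puts "#{li.text}: category #{category}"
end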
Mark Thomas answered Sep 05 '25 00:09
