
Parse XML that has a node with multiple values with Nokogiri

I am not sure which XML syntax is correct, so I am including two variants; please point out the right one.

I have XML that has a node with multiple values:

Case 1:

<items>
  <item>
    <image_urls>http://static.elefant.ro/images/26/95226/husa-belkin-grip-pentru-kindle-3-ebook-reader-albastru_1_categorie.jpg http://www.keenthemes.com/preview/metronic/theme/assets/global/plugins/jcrop/demos/demo_files/image1.jpg
    </image_urls>
  </item>
</items>

Case 2:

<items>
  <item>
    <image_urls>
http://static.elefant.ro/images/26/95226/husa-belkin-grip-pentru-kindle-3-ebook-reader-albastru_1_categorie.jpg
    </image_urls>
    <image_urls>http://www.keenthemes.com/preview/metronic/theme/assets/global/plugins/jcrop/demos/demo_files/image1.jpg
    </image_urls>
  </item>
</items>

My problem is reading a node that holds multiple values with Nokogiri. I tried:

item.at("image_urls").to_s.split(" ").inject([]) { |result, element| 
  result << element 
}

But this works only for the first variant of the XML. If the second form is the correct syntax, which I believe it is, how can I get both values? My implementation below only takes the first:

xml = Nokogiri::XML(File.open(self.file.current_path))
xml.xpath("//item").each do |item|
  attachments_array = item.at("image_urls").inject([]) { |result, element|
    result << element
  }
end
Lucian Tarna asked Feb 01 '26 07:02


1 Answer

Both variants are well-formed XML. To read every image_urls node you need to use the css method, which returns all matches, as opposed to at, which returns only the first match:

require 'nokogiri'

text = <<EOD
<items>
  <item>
    <image_urls>http://static.elefant.ro/images/26/95226/husa-belkin-grip-pentru-kindle-3-ebook-reader-albastru_1_categorie.jpg http://www.keenthemes.com/preview/metronic/theme/assets/global/plugins/jcrop/demos/demo_files/image1.jpg
    </image_urls>
  </item>
  <item>
    <image_urls>
http://static.elefant.ro/images/26/95226/husa-belkin-grip-pentru-kindle-3-ebook-reader-albastru_1_categorie.jpg</image_urls>
<image_urls>http://www.keenthemes.com/preview/metronic/theme/assets/global/plugins/jcrop/demos/demo_files/image1.jpg
    </image_urls>
  </item>
</items>
EOD

xml = Nokogiri::XML(text)

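xml.css('item').each do |item|
  # css returns every <image_urls> node under this item. split with no
  # arguments breaks on any whitespace and ignores leading/trailing whitespace,
  # which avoids strip!, a method that returns nil when nothing is stripped.
  attachments = item.css('image_urls').flat_map { |url| url.text.split }
  p attachments
end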
# ["http://static.elefant.ro/images/26/95226/husa-belkin-grip-pentru-kindle-3-ebook-reader-albastru_1_categorie.jpg", "http://www.keenthemes.com/preview/metronic/theme/assets/global/plugins/jcrop/demos/demo_files/image1.jpg"]
# ["http://static.elefant.ro/images/26/95226/husa-belkin-grip-pentru-kindle-3-ebook-reader-albastru_1_categorie.jpg", "http://www.keenthemes.com/preview/metronic/theme/assets/global/plugins/jcrop/demos/demo_files/image1.jpg"]
Alexey Shein answered Feb 02 '26 23:02