I need to parse a table from a web page. I've done this before using Ruby and Nokogiri but this time my method is not working. This is what I'm doing:
response = RestClient.get "http://www.webpage.com?page=0"
doc = Nokogiri::HTML(response.body,nil,'utf-8')
doc.remove_namespaces!
table = doc.xpath(".//*[@id='contsinderecha']/form/table/tbody/tr[4]/td/table/tbody/tr[5]/td/table")
table is just an empty array. The response is fine: if I do puts response.body I get the body of the webpage.
Also, I'm using Firebug to get the XPath.
Any idea of what may be happening?
The solution to your problem is to get rid of the tbody parts in your XPath, as suggested in "Why does this Nokogiri XPath have a null return?". Firefox generated the tbody elements for you, which is why they appear in Firefox's XPath, but they are not part of the original page source.
Try the following:
require 'rest-client'
require 'nokogiri'

response = RestClient.get "http://www.buenosaires.gob.ar/areas/seguridad_justicia/seguridad_urbana/estaciones_servicio/buscador.php?&pag=0"
doc = Nokogiri::HTML(response.body, nil, 'utf-8')
doc.remove_namespaces!
# Same path as in the question, minus the tbody steps that Firefox added
table = doc.xpath(".//*[@id='contsinderecha']/form/table/tr[4]/td/table/tr[5]/td/table")