Can I pull information after a keyword?

Tags:

xpath

scrapy

I am running a spider that pulls information such as prices and shipping. The shipping information comes back like this: "Shipping:$.99,Shipping:,Shipping:,Shipping:$.49". The code that extracts it looks like this:

item["shipping"] = vendor.xpath("normalize-space(.//span[@class='shippingAmount']/text())").extract()

Can I rewrite this line to pull just the price after "Shipping:"?

six7zero9 asked Jan 26 '26 09:01


2 Answers

Use a combination of substring-after and substring-before, i.e.:

substring-before(
  substring-after(
    "Shipping:$.99,Shipping:,Shipping:,Shipping:$.49",
    "Shipping:"),
  ","
)

In XPath 1.0, there is no way to fetch all shipping amounts in one expression when the number of shipping fees is arbitrary. You could query the 2nd, 3rd, ... value by repeatedly calling substring-after($string, "Shipping:") to strip off the preceding value.

(Linebreaks can be omitted, of course.)
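If the repeated substring-after calls are easier to do on the Python side of the spider, the same idea can be sketched with str.partition. This is only an illustration, not Scrapy's API: the helper names substring_after, substring_before and shipping_values are made up here, and the sketch assumes the values are separated by "," as in the question.

```python
def substring_after(text, marker):
    """Mimic XPath substring-after: text after the first marker, '' if absent."""
    _, found, tail = text.partition(marker)
    return tail if found else ""


def substring_before(text, marker):
    """Mimic XPath substring-before: text before the first marker, '' if absent."""
    head, found, _ = text.partition(marker)
    return head if found else ""


def shipping_values(text, keyword="Shipping:", sep=","):
    """Collect every value following `keyword` (hypothetical helper)."""
    values = []
    rest = text
    while keyword in rest:
        # Drop everything up to and including the next keyword.
        rest = substring_after(rest, keyword)
        # The last value has no trailing separator, so take the remainder.
        value = substring_before(rest, sep) if sep in rest else rest
        values.append(value)
    return values


raw = "Shipping:$.99,Shipping:,Shipping:,Shipping:$.49"
print(shipping_values(raw))  # ['$.99', '', '', '$.49']
```

Empty strings mark the entries that had no amount, so the positions still line up with the vendors.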

Jens Erat answered Jan 28 '26 07:01


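You can extract the prices with a regular expression:

import re
# Avoid naming the variable `str`, which would shadow the built-in type.
shipping = "Shipping:$.99,Shipping:,Shipping:,Shipping:$.49"
# Optional integer part, optional decimal point, then at least one digit.
re.findall(r'\d*\.?\d+', shipping)
['.99', '.49']

EDIT

To get 0 where there is no shipping amount:

[float(x) if x else 0 for x in re.sub(r'Shipping:[$]?', '', shipping).split(',')]
[0.99, 0, 0, 0.49]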
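If you would rather do it in a single pass, something like the following might work. It is a rough sketch: SHIPPING_RE and parse_shipping are names invented for this example, and it assumes every entry starts with the literal "Shipping:" and that amounts look like ".99", "3" or "12.50" with an optional "$".

```python
import re

# One match per "Shipping:" entry. The capture group holds the amount and is
# empty when nothing numeric follows the keyword.
SHIPPING_RE = re.compile(r"Shipping:\s*\$?(\d+(?:\.\d+)?|\.\d+)?")


def parse_shipping(raw):
    """Return one float per "Shipping:" entry, using 0.0 for missing amounts."""
    return [float(amount) if amount else 0.0
            for amount in SHIPPING_RE.findall(raw)]


raw = "Shipping:$.99,Shipping:,Shipping:,Shipping:$.49"
print(parse_shipping(raw))  # [0.99, 0.0, 0.0, 0.49]
```

Because it keys on the keyword rather than on the commas, this version should not care which separator sits between the entries.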
agstudy answered Jan 28 '26 08:01


